Friday, May 25, 2007

RSKEYMGMT command for failover to standby TFS server

We're a daring group of folks who have added a standby TFS App Tier server to our topology. We'd like to see TFS support true failover, but for now we're comfortable with a warm backup that we can get up and running with some manual intervention if need be.

We followed "How to: Activate a Fail-Over Application-Tier Server" to a tee and, as it turned out, that was our downfall. I'm not saying the documentation is wrong; all I'm saying is that we had to modify step 4 under Reporting Services to get our standby up and running.

After a number of hours, we broke down and called Microsoft Support who directed us to a very competent TFS support engineer. He correctly diagnosed the issue as being Reporting Services related and joined us with another very competent Reporting Services support engineer. Microsoft should be proud of the work they both did. Great support is one of the primary reasons we buy commercial software/support.

The issue was with the command "RSKEYMGMT -a -i <instance ID for AT2> -f c:\backups\My_RSBackup_TFS_AT01 -p aPassword", which failed with the error "Unable to locate the Report Server Windows service for instance <instance ID>". From what I understand, the TFS install requires Reporting Services to be installed as a "default" instance (see step 10 in "How to: Install Microsoft SQL Server 2005 Reporting Services for Team Foundation Server (Dual-Server Deployment)" in the TFS install documentation), so you can't activate it by a named instance name. Thus the "-i" switch in the command was throwing the error.

Removing the -i switch and running the command "RSKEYMGMT -a -f c:\backups\My_RSBackup_TFS_AT01 -p aPassword" on the standby App Tier worked just fine.
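To make the difference between the two invocations concrete, here is a minimal Python sketch that assembles the argument list and only adds -i when an instance ID is supplied. The helper name build_rskeymgmt_args is hypothetical; the switches (-a, -i, -f, -p), the backup path and the password are the ones from the commands above. It only builds the command line and does not run anything.

```python
def build_rskeymgmt_args(backup_file, password, instance_id=None):
    """Build the rskeymgmt argument list for applying a backed-up key.

    Hypothetical helper: instance_id is left as None for a default
    Reporting Services instance, which is what the TFS install uses.
    """
    args = ["rskeymgmt", "-a"]
    if instance_id is not None:
        # Only a named instance gets the -i switch.
        args += ["-i", instance_id]
    args += ["-f", backup_file, "-p", password]
    return args


default_args = build_rskeymgmt_args(r"c:\backups\My_RSBackup_TFS_AT01", "aPassword")
print(" ".join(default_args))
```

On the standby App Tier the resulting list could be handed to subprocess.run, but that step is left out here on purpose.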

Although the process is not perfect, we're now able to fail over. Hope this helps if you run into the same issue.

Thursday, May 24, 2007

TFS Data Warehouse Not Being Updated by TFS TFSServerScheduler

May 2007 - We had an ongoing issue where the TFS Data Warehouse was not being updated by TFS TFSServerScheduler, although the update could be invoked manually. What we think happened is that when we installed TFS to a different port (i.e. 8888) and went through the process of putting our App Tier behind a Fully Qualified Domain Name, not all of the registry entries were updated as needed.

With the help of this posting and some sleepless nights, we sifted through the App Tier's registry and found a number of references to AppTier-B01:8080 instead of AppTier-B01:8888 or, better, tfs.int.mycompany.com:8888, which is a DNS entry for AppTier-B01:8888. We also found a number of duplicate references where both AppTier-B01:8080 and tfs.mycompany.com:8888 were listed. We logged on as each of the TFSInstall, TFSService, and TFSReports accounts and did the clean-up, because some of the settings are per-user under HKEY_CURRENT_USER\Software\Microsoft\VisualStudio\8.0\TeamFoundation\Servers.

The lesson learned is to make sure all your registry settings are correct for ALL USERS. Because our AppTier name did not change (we just put it behind DNS), we think the most important piece was the port number which we changed from 8080 to 8888. Below are our current settings on the App Tier.

Sys Admin Account:

HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\VisualStudio\8.0\TeamFoundation\Servers
tfs.mycompany.com = http://tfs.mycompany.com:8888

HKEY_CURRENT_USER\SOFTWARE\Microsoft\VisualStudio\8.0\TeamFoundation\Servers
tfs.mycompany.com = http://tfs.mycompany.com:8888

TFSService Account:

HKEY_CURRENT_USER\SOFTWARE\Microsoft\VisualStudio\8.0\TeamFoundation\Servers
tfs.mycompany.com = http://tfs.mycompany.com:8888
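The check we did by hand can be sketched in Python: given the name/URL pairs found under a TeamFoundation\Servers key, flag every entry whose host or port is not the expected one. The function name find_stale_entries and the sample dictionary are hypothetical, and actually reading the values from the registry (for each account) is left out of this sketch.

```python
from urllib.parse import urlsplit


def find_stale_entries(entries, expected_host, expected_port):
    """Return the names of server entries whose URL host or port is wrong.

    entries maps a registry value name to its URL, e.g. the values under
    ...\\VisualStudio\\8.0\\TeamFoundation\\Servers for one account.
    """
    stale = []
    for name, url in entries.items():
        parts = urlsplit(url)
        # urlsplit lower-cases the hostname, so compare in lower case.
        if parts.hostname != expected_host.lower() or parts.port != expected_port:
            stale.append(name)
    return sorted(stale)


# Hypothetical sample mirroring the stale and correct entries described above.
sample = {
    "AppTier-B01": "http://AppTier-B01:8080",
    "tfs.mycompany.com": "http://tfs.mycompany.com:8888",
}
stale_names = find_stale_entries(sample, "tfs.mycompany.com", 8888)
print(stale_names)
```

Running the same check once per account is the point: the HKEY_CURRENT_USER values differ per user, so one clean account says nothing about the others.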

Wednesday, May 23, 2007

Can you have TFS states and groups with the same name?

May 2007 - We've been dealing with the very informative user error message "TF26212: Team Foundation Server could not save your changes. There may be problems with the work item type definition. Try again or contact your Team Foundation Server Administrator." for about two weeks now and we finally think we have an answer.

The error seemed to show up when we had a state (e.g. Testing) named the same as a group (e.g. Testing). Here are our steps and final thoughts on how we resolved the issue.

Start by editing the Web.config file under \Program Files\Microsoft Visual Studio 2005 Team Foundation Server\Web Services by changing

name="traceLevel" value="1" to name="traceLevel" value="4"

and

key="traceWriter" value="false" to key="traceWriter" value="true"

This will start logging a large amount of information under C:\WINDOWS\Temp\TFLogFiles (or some other directory if you change the defaults).

After you make these changes (you don't need to bounce IIS), try to reproduce the error. If you see an error like the one below in the TFLogFiles, you may be in the same boat as us.
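If you would rather script the two Web.config changes than edit the file by hand, a minimal Python sketch follows. It assumes both settings are <add> elements, one identified by name="traceLevel" and one by key="traceWriter", as in the snippets above; the surrounding element layout in SAMPLE is an assumption made for illustration, not a copy of the real file. Back up the real Web.config first, since ElementTree drops comments when it rewrites XML.

```python
import xml.etree.ElementTree as ET

# Hypothetical, cut-down stand-in for the TFS Web.config.
SAMPLE = """<configuration>
  <appSettings>
    <add key="traceWriter" value="false" />
  </appSettings>
  <system.diagnostics>
    <switches>
      <add name="traceLevel" value="1" />
    </switches>
  </system.diagnostics>
</configuration>"""


def enable_tracing(config_xml):
    """Return config_xml with traceWriter set to true and traceLevel set to 4."""
    root = ET.fromstring(config_xml)
    for add in root.iter("add"):
        if add.get("key") == "traceWriter":
            add.set("value", "true")
        elif add.get("name") == "traceLevel":
            add.set("value", "4")
    return ET.tostring(root, encoding="unicode")


updated = enable_tracing(SAMPLE)
print(updated)
```

Remember to set both values back when you are done, as level 4 tracing is verbose.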

[WI] [Error, 2916, 5, 11:31:33.016] SvrEx: Microsoft.TeamFoundation.WorkItemTracking.Server.ValidationException: Forcing rollback ---> System.Data.SqlClient.SqlException: Forcing rollback
at System.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection)
at System.Data.SqlClient.SqlInternalConnection.OnError(SqlException exception, Boolean breakConnection)
at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj)
at System.Data.SqlClient.TdsParser.Run(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj)
at System.Data.SqlClient.SqlDataReader.HasMoreRows()
at System.Data.SqlClient.SqlDataReader.ReadInternal(Boolean setTimeout)
at System.Data.SqlClient.SqlDataReader.NextResult()
at Microsoft.TeamFoundation.WorkItemTracking.Server.PayloadTableCollection.Populate(SqlDataReader reader)
at Microsoft.TeamFoundation.WorkItemTracking.Server.SqlAccess.ExecuteBatchPayloadImpl(IRequestContext context, String sqlBatch, List`1 parameterList, Boolean& errorOnBulkUpdate, String connectionString)
--- End of inner exception stack trace ---.


While we've had differing results with a number of use cases, the root cause seems to be related to having a state and group named the same. We saw the issue with a state called CCB and group called CCB and the same issue with a state called Testing and a group called Testing.

What's even more alarming is they don't even have to be in the same project. We're pretty sure if you have a Testing group in Project1 and a Testing state in Project2, you will get a conflict.

Again, we're not 100% sure this should be written in stone, but our error has disappeared now that we have renamed the CCB state to Change Control Board and the Testing group to Testers, and removed all references to the CCB state and Testing group from all the projects in our instance. This seemed to work for us.
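If you want to look for the same kind of clash in your own instance, the comparison can be sketched in Python as below. The function name find_name_collisions and the sample data are hypothetical, and the case-insensitive comparison is our assumption; the sketch takes state and group names that you have already exported per project and reports every group name that matches a state name in any project, including a different one.

```python
def find_name_collisions(states_by_project, groups_by_project):
    """Return (name, group_project, state_project) for each group/state clash.

    Names are compared case-insensitively (an assumption) and across
    projects, since the conflict did not seem limited to one project.
    """
    projects_by_state = {}
    for project, states in states_by_project.items():
        for state in states:
            projects_by_state.setdefault(state.casefold(), set()).add(project)

    collisions = []
    for project, groups in groups_by_project.items():
        for group in groups:
            for state_project in projects_by_state.get(group.casefold(), ()):
                collisions.append((group, project, state_project))
    return sorted(collisions)


# Hypothetical data matching the Project1/Project2 example above.
states = {"Project2": ["Active", "Testing", "Closed"]}
groups = {"Project1": ["Testing", "Developers"]}
clashes = find_name_collisions(states, groups)
print(clashes)
```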

Good luck!

Friday, May 11, 2007

Report Server Windows Service (MSSQLSERVER) cannot connect to the report server database

May 2007 - We just got our standby server set up and failed over for the first time. I'll send out the steps we took in a forthcoming post as we're still working out some minor details.

We were a bit concerned because, after setting up the standby server, we were getting the error messages below in Event Viewer. The first error message happened just once (as far as we can tell). The second one was recorded every minute in Event Viewer on the standby.

When we failed over to the standby, both errors went away there and started to appear on the primary. I posted a message to the MSDN forums and Mr. Chen confirmed this was as expected. He said the reason was "This is usually used to improve production quality in the future." I'm not sure I follow him on why this will help to improve quality, but I'll take his word for it.

Happy failover!

Error Messages:

Report Server Windows Service (MSSQLSERVER) has not been granted access to the catalog content.

Report Server Windows Service (MSSQLSERVER) cannot connect to the report server database.

Thursday, May 03, 2007

HTTP Status 400 With Reporting Services

May 2007 - There are probably a myriad of reasons to get the error "The request failed with HTTP status 400: Bad Request" in Reporting Services running with TFS, but here is why we got the error.

While trying to get our TFS instance behind a fully qualified DNS name, we added tfs.int.mycompany.com as a host header entry on the Default Web Site on our TFS App Tier. After adding this entry users could still see Reports in Visual Studio, but when right-clicking and selecting Show Report Site, Internet Explorer would pop up and give us the error "The request failed with HTTP status 400: Bad Request". As a side note, adding this entry in the host header did nothing for us while trying to get our App Tier behind DNS.

Once we removed that value, the error went away. Hope this helps anyone else.

Tuesday, May 01, 2007

Our own Dog Food Statistics

May 2007 - What a month this has been. We had some issues getting TFS up and running (see my previous post for some of the bigger ones), but finally we have a PROD system that is capable of taking on users. For those of you who are thinking about implementing TFS, make sure you have dedicated resources to work on it. We're fortunate that our company has funded a full-time administrator to get this up and running for 200+ users (1 admin --> 200 users). If you're a Sys Admin who, in addition to your day-to-day responsibilities, is supposed to get TFS up and running, good luck!

I'm a bit of a statistics nut, so while we wait for Brian Harry to release the TFSServerManagerTool (http://blogs.msdn.com/bharry/archive/2007/01/22/tfsservermanager-powertool.aspx) I thought I'd try to pull some simple statistics that we might find valuable and post them. Since our first project was created today, it seems like an apropos time to make this posting. I'm a bit pressed for time so I didn't have the opportunity to put together the myriad of statistics that Brian has. I'll wait for him to release the TFSServerManager PowerTool to get those crazy numbers. That being said, we find this information helpful, so here goes.

Team Projects = 1; This was a pretty easy one to find!

Users = 6; This one was a bit tougher. I started by running the TfsSecurity tool (http://msdn2.microsoft.com/en-us/library/ms252504(VS.80).aspx) with the "/imx all:" parameter. Unfortunately this gave us back all the users plus the 50 Sys Admins who also have access to the machine but would never use TFS with their administration accounts (e.g. MyDomain\M0000001). Consequently we don't want to count them. The good thing is our user accounts all start with \U (U for user, they tell me), so I was able to find actual TFS users by running "TfsSecurity /server:tfs.int.mycompany.com /imx n:"[SERVER]\Team Foundation Valid Users" | find "MyDomain\U"". This gave us back the MyDomain\U users (e.g. MyDomain\U000001), which are the only users we add to TFS projects.

Work items = 16; This was also easy as each new work item creates a new incremented id. There are 15 standard CMMI process template Tasks that get created. After that we created one to change a Change Request state which was number 16.

Files/Folders = 1859/31; I got these values using the following commands. For the total number of items I ran "tf dir /server:tfs.int.mycompany.com $/ /recursive". For folders I ran "tf dir /server:tfs.int.mycompany.com $/ /folders /recursive". The file count is the total less the number of folders.

Changesets = 7; Another easy one for now. I got this value by running "tf changeset /server:tfs.int.mycompany.com /latest /noprompt"
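The user and file counts above involve a little post-processing, which can be sketched in Python. Both helper names are hypothetical, and so is the sample text standing in for TfsSecurity output (the real output format may differ). The first helper counts distinct account names that start with a prefix such as MyDomain\U; the second is just the "total items less folders" subtraction, where 1890 is the total implied by our 1859 files and 31 folders.

```python
import re


def count_prefixed_users(tool_output, prefix):
    """Count distinct account names in tool_output that start with prefix."""
    pattern = re.compile(re.escape(prefix) + r"\w+", re.IGNORECASE)
    return len({match.group(0).lower() for match in pattern.finditer(tool_output)})


def file_count(total_items, folder_count):
    """Files are whatever is left once folders are removed from the total."""
    return total_items - folder_count


# Hypothetical lines standing in for real TfsSecurity output.
sample_output = (
    "[U] MyDomain\\U000001\n"
    "[U] MyDomain\\M0000001\n"
    "[U] MyDomain\\U000002\n"
    "[U] MyDomain\\U000001\n"
)
users = count_prefixed_users(sample_output, "MyDomain\\U")
files = file_count(1890, 31)
print(users, files)
```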

That's it for now. If you find additional or better ways to find your own Dog Food statistics before Brian releases TFSServerManagerTool, let us know.