Search index configuration for FBA MOSS
I am working with a client on a public-facing internet site that includes a public area for unauthenticated users and a ‘members’ area, for which there is already a bespoke SQL database holding all members’ credentials (including clear text passwords!). Given the tight timeframe and the amount of bespoke development required for other areas of the site, I decided that it would be best to use the standard .NET Framework forms-based authentication (FBA). The site owners manually process new member applications and maintain the existing database, so we will create an additional administration screen to maintain the FBA database via SharePoint. The site owners will use Windows authentication (NTLM) via an ‘authoring.companyname.co.uk’ URL and members will use FBA via an ‘internal.companyname.co.uk’ URL.
At least two document libraries will be included in the members area of the site but they will not be directly exposed to members. Instead, each library will have a dedicated page on the site where members can browse through all the documents in the library or submit a basic search. The documents will be displayed in a formatted table (grid) for both browsing and search results.
As the CSS formatting of the page was so bespoke I decided that my starting point would be a clone of a WCM page from elsewhere on the site to which I would add an ASP.NET GridView control. When a user first visits the page it displays all the documents in the related document library by binding the grid (called searchResultsTable) to an SPDataSource object that has had the List property set to reference the document library.
figure 1 – binding to an SPDataSource
That was the easy part. To perform the search I added a text box and an image button as per the following screen shot.
figure 2 – formatted results with paging
When the user submits a search the grid is bound to the results of an SPQuery that has been loaded into a DataTable. The important thing to note here is that the AuthenticationType property of the SPQuery object must be correctly set depending on whether the user is NTLM or FBA authenticated as per the highlighted section in the code below.
figure 3 – binding to an SPQuery filled DataTable
Before any of this will work, the search indexer has to be configured to index the FBA version of the site. The NTLM site is automatically included in the default search content source (typically named ‘Local Office SharePoint Server sites’), so any documents stored anywhere in the site are automatically indexed. However, FBA-authenticated users searching the site don’t see anything in that index because they access the site through a separate URL. I therefore added a search content source dedicated to indexing the site via the FBA URL.
figure 4 – adding a dedicated FBA search content source
As per the code in figure 3, the query passed into the SPQuery object via the QueryText property specifies the name of the search scope to be queried, similar to this (where x, y, z are the fields to return and scope_name is the name of the search scope):
SELECT x, y, z FROM Scope() WHERE "scope"='scope_name'
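To make the shape of that query text concrete, here is a minimal sketch of how the string could be assembled. It is written in Python purely for illustration (the page itself is .NET). The function name, field names and scope name are hypothetical, and doubling single quotes is the usual SQL escaping convention, which I am assuming applies to the search syntax too.

```python
def build_scope_query(fields, scope_name):
    """Return search SQL that selects `fields` from a named search scope."""
    if not fields:
        raise ValueError("at least one field is required")
    # Assumption: double any single quote so the scope name cannot
    # terminate the string literal early (standard SQL convention).
    escaped_scope = scope_name.replace("'", "''")
    return 'SELECT {0} FROM Scope() WHERE "scope"=\'{1}\''.format(
        ", ".join(fields), escaped_scope
    )


# Hypothetical field and scope names, for illustration only.
query_text = build_scope_query(["Title", "Path", "Write"], "Members Documents")
print(query_text)
```

The quotes around the scope name must be plain straight single quotes. Curly quotes pasted in from a word processor are not string delimiters.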
I therefore needed to set up a search scope for the document library but ensure it worked from both the NTLM and FBA sites.
figure 5 – document library search scope for NTLM & FBA
The final piece of the puzzle is ensuring that the search indexer has access to the FBA site. As per the following screen shot, on a standard SharePoint Shared Service Provider the default content access account (used for indexing site content) is an NTLM account.
figure 6 – default content access account
Therefore the default content access account cannot create an index that FBA users can query. I tried to add a crawl rule that would use a specific FBA account to crawl the FBA site.
figure 7 – add a crawl rule
The FBA site is currently using the default SharePoint FBA log in page, to which unauthenticated users are automatically redirected when they first attempt to access an area of the site that requires authentication. I entered the URL of the log in page (http://internal.companyname.co.uk/_layouts/login.aspx) in the Form URL field (under the Specify Authentication section), but when I clicked the Enter Credentials button I got a 403 error.
figure 8 – 403 error when entering FBA credentials
I noticed that every time a user is redirected to the log in page a ‘ReturnUrl’ query string is appended to the URL something like this:
When I used this extended URL on the Add Crawl Rule page and clicked the Enter Credentials button, the site log in page was correctly displayed. I entered the credentials of the FBA search content access account and was redirected to the default FBA site page. The following dialog popped up asking if the log in had succeeded.
figure 9 – successful log in with FBA credentials
After clicking OK I was returned to the Add Crawl Rule page and a message (highlighted below) indicated that the credentials had been accepted.
figure 10 – apparently correct FBA credentials
However, when I clicked OK I got an error telling me the credentials had either not been entered or had not been accepted?!
figure 11 – credentials not accepted?!
This got very frustrating. I couldn’t find anything useful in my excellent reference book, Inside the Index and Search Engines: Microsoft Office SharePoint Server 2007, because it is nearly 2 years old and there have been some significant advances since then, such as the infrastructure updates. As with the book, the most relevant information I found through searches was about the addrule.exe command line tool, which didn’t seem right for an installation that is up to date with service packs and hotfixes.
One blog post I found explaining how to add a crawl rule caught my attention because of the log in page URL that was specified. It was similar to the URL I had used but the ‘ReturnUrl’ query string parameter was notably shorter (just ‘%2f’) so I tried creating the crawl rule again using the shortened URL:
This time everything worked and the crawl rule was successfully added!
figure 12 – crawl rule added
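For reference, a log in URL with the short ‘ReturnUrl’ value can be composed as in the sketch below. It is Python for illustration only. The function name is hypothetical, and I am assuming the log in page address given earlier and a return path of ‘/’, which is what ‘%2f’ decodes to.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit


def login_form_url(login_page_url, return_path="/"):
    """Return the log in page URL with a ReturnUrl query string appended."""
    parts = urlsplit(login_page_url)
    # urlencode percent-encodes "/" as %2F. Hex digits in percent-encoding
    # are case-insensitive, so this is equivalent to the %2f form.
    query = urlencode({"ReturnUrl": return_path})
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))


print(login_form_url("http://internal.companyname.co.uk/_layouts/login.aspx"))
```

The point is simply that the return path is kept as short as possible. I have not established why the longer value was rejected.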
With the crawl rule in place the indexer had FBA credentials and details of the log in page, but the FBA account still needed permission to read all the content in the FBA site. This is accomplished in exactly the same way that the default content access account is granted rights to browse an entire (NTLM) site: by means of a policy. From the Application Management tab in Central Administration, click on the Policy for Web application link.
figure 13 – Policy for Web application
On the Policy for Web application page click the Add Users button and ensure that the appropriate Web Application is selected. Select the zone that the FBA site is running under then click the Next button. Then enter the FBA user account and check the “Full Read – Has full read-only access” check box before clicking the Finish button.
And finally I could run a full crawl against the FBA site content source…
figure 14 – start a full crawl of the FBA site
… and watch the index grow!
figure 15 – finally the FBA site is indexed!