Services: Content Indexing, Tagging, and Classification Consulting
Business Context
Creating an enterprise taxonomy is only the first step in improving information access in a large organization. Indexing and tagging the enterprise’s content so that it is placed in the taxonomic categories of the existing corpus is a complex technical task that requires significant time and attention. We have found that this step can be a sticking point for many companies and their IT organizations. Many IT specialists are unaware that best practices and lessons learned can be applied to this important task.
Typical Outcomes and Deliverables
SearchScience technology and taxonomy experts provide experience-based insights directly to your IT specialists responsible for tagging, indexing and classifying intranet content. If necessary, we can also perform the initial corpus indexing and tagging for you under an outsourcing arrangement.
Our experience has shown that the best solutions combine manual and automatic classification techniques, pairing the insights of business experts with the speed and clarity of rules-based automatic classification methodologies. We have found that statistical, neural network, and other training-based pattern-matching techniques do not yet yield results of high enough quality to replace a combined manual and rules-based approach.
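As a rough illustration of the combined approach described above, the sketch below applies keyword rules automatically and routes any document that no rule matches to a business expert for manual tagging. It is a minimal sketch under stated assumptions, not SearchScience tooling: the category names, the patterns, and the names `RULES`, `TagResult`, `classify`, and `apply_manual_tags` are all hypothetical and chosen only for the example.

```python
import re
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical taxonomy rules: category -> keyword patterns (illustrative only).
RULES: Dict[str, List[str]] = {
    "HR/Benefits": [r"\bbenefits?\b", r"\bpension\b"],
    "IT/Security": [r"\bpasswords?\b", r"\bfirewall\b"],
}


@dataclass
class TagResult:
    doc_id: str
    tags: List[str]
    needs_review: bool  # True when a business expert should tag the document


def classify(doc_id: str, text: str,
             rules: Dict[str, List[str]] = RULES) -> TagResult:
    """Rules-based step: assign every category whose patterns match the text."""
    tags = sorted(
        category
        for category, patterns in rules.items()
        if any(re.search(p, text, re.IGNORECASE) for p in patterns)
    )
    # Documents that no rule matches are flagged for manual tagging.
    return TagResult(doc_id, tags, needs_review=not tags)


def apply_manual_tags(result: TagResult, expert_tags: List[str]) -> TagResult:
    """Manual step: merge an expert's tags with the automatic ones."""
    merged = sorted(set(result.tags) | set(expert_tags))
    return TagResult(result.doc_id, merged, needs_review=False)
```

In this sketch, a call such as `classify("d2", "Quarterly cafeteria menu")` matches no rule and is flagged for review, after which `apply_manual_tags` records the expert's decision. A real engagement would derive the rules from the client's own taxonomy rather than from a hard-coded dictionary.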
Client Sponsorship and Participation Required
Indexing, tagging, and classification are usually perceived as technical issues that do not require significant business sponsorship to carry forward. Our experience, in contrast, has been that business sponsorship is critical even for this highly technical task, because it requires a commitment by IT to dedicate resources to a difficult and time-consuming task.
SearchScience Expertise and Qualifications
Our technology specialists are experts at developing and deploying technologies and processes for corpus content extraction, indexing, taxonomy-based tagging and content exposure (search, browse, label, etc.).
Timeframes
A content tagging, indexing and classification engagement typically lasts between 6 and 12 weeks, depending on the size and complexity of the corpus being processed.


