Analyze, extract, and sanitize text content at scale. The Text Analysis category provides three APIs that cover readability assessment, keyword extraction, and sensitive data redaction, three NLP tasks that most content pipelines need.
Use these services to enforce editorial standards by scoring readability before publication, generate SEO keyword suggestions from article bodies, or strip PII from log streams before they land in your data lake. Each API is stateless and processes text in real time.
Score any block of text against industry-standard readability formulas: Flesch Reading Ease, Flesch-Kincaid Grade Level, Gunning Fog Index, Coleman-Liau Index, and the Automated Readability Index.
The response also includes word count, sentence count, and average words per sentence. Integrate it into your CMS to enforce a target grade level before content goes live.
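To make the scores more concrete, here is a minimal local sketch in Python that computes a few of these formulas from their published definitions. It is not the service's implementation. The function names (`readability`, `passes_grade_gate`), the regex-based sentence and word splitting, and the vowel-group syllable counter are assumptions made for illustration, and Gunning Fog is left out for brevity.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (assumption): one syllable per group of consecutive vowels."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def readability(text: str) -> dict:
    """Compute basic text statistics and four readability scores."""
    # Naive tokenization: sentences end at ., ! or ?; words are runs of letters.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one word and one sentence")

    n_words, n_sentences = len(words), len(sentences)
    letters = sum(len(w.replace("'", "")) for w in words)
    syllables = sum(count_syllables(w) for w in words)

    words_per_sentence = n_words / n_sentences
    syllables_per_word = syllables / n_words
    letters_per_100_words = letters / n_words * 100
    sentences_per_100_words = n_sentences / n_words * 100

    return {
        "word_count": n_words,
        "sentence_count": n_sentences,
        "avg_words_per_sentence": words_per_sentence,
        "flesch_reading_ease": 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word,
        "flesch_kincaid_grade": 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59,
        "coleman_liau": 0.0588 * letters_per_100_words - 0.296 * sentences_per_100_words - 15.8,
        "automated_readability": 4.71 * (letters / n_words) + 0.5 * words_per_sentence - 21.43,
    }


def passes_grade_gate(text: str, max_grade: float = 8.0) -> bool:
    """Hypothetical CMS check: accept text at or below a target grade level."""
    return readability(text)["flesch_kincaid_grade"] <= max_grade


scores = readability("The cat sat on the mat. It was happy.")
print(scores["word_count"], scores["sentence_count"])
```

A pre-publish hook could call something like `passes_grade_gate` and block the save when it returns `False`. Scores from a heuristic syllable counter can differ from those of a production tokenizer, so treat the numbers as approximate.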
Extract the most relevant keywords and key phrases from any text using a combination of TF-IDF weighting and the RAKE (Rapid Automatic Keyword Extraction) algorithm. Each keyword is returned with a relevance score.
You can cap the number of keywords returned, and the service detects the input language automatically. Use it for SEO analysis, document tagging, or building topic clusters.
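For intuition about how RAKE ranks phrases, the Python sketch below implements a simplified version of the algorithm: it splits text into candidate phrases at stopwords and punctuation, scores each word as degree divided by frequency, and sums the word scores per phrase. It does not show how the service combines RAKE with TF-IDF, which is not described here. The function name `rake_keywords`, the `max_keywords` parameter, and the tiny stopword list are assumptions for illustration only.

```python
import re
from collections import defaultdict

# Tiny illustrative stopword list (assumption); real RAKE setups use a much larger one.
STOPWORDS = {
    "a", "an", "and", "the", "of", "for", "in", "on", "to", "is", "are",
    "with", "by", "from", "that", "this", "it", "as", "or", "be",
}


def rake_keywords(text: str, max_keywords: int = 5) -> list[tuple[str, float]]:
    """Return up to max_keywords (phrase, score) pairs, highest score first."""
    phrases: list[list[str]] = []
    # Punctuation ends a candidate phrase; so does any stopword.
    for fragment in re.split(r"[^\w\s'-]+", text.lower()):
        current: list[str] = []
        for word in fragment.split():
            if word in STOPWORDS:
                if current:
                    phrases.append(current)
                current = []
            else:
                current.append(word)
        if current:
            phrases.append(current)

    # freq: how often a word appears in candidate phrases.
    # degree: total length of the phrases it appears in (co-occurrences, itself included).
    freq: dict[str, int] = defaultdict(int)
    degree: dict[str, int] = defaultdict(int)
    for phrase in phrases:
        for word in phrase:
            freq[word] += 1
            degree[word] += len(phrase)

    # Phrase score is the sum of its word scores (degree / frequency).
    scores = {
        " ".join(phrase): sum(degree[w] / freq[w] for w in phrase)
        for phrase in phrases
    }
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:max_keywords]


sample = (
    "Keyword extraction is the task of finding important terms. "
    "Rapid automatic keyword extraction scores phrases by word co-occurrence."
)
for phrase, score in rake_keywords(sample, max_keywords=3):
    print(f"{score:5.1f}  {phrase}")
```

Plain RAKE favors long phrases built from words that co-occur often, which is one reason a service might pair it with a corpus-level weighting such as TF-IDF.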
Detect and replace sensitive data in log entries and free-form text. Built-in detectors cover email addresses, IP addresses, credit card numbers, phone numbers, and API keys. Each detection type can be independently enabled or disabled.
Add custom regex patterns for domain-specific secrets. The response contains the redacted string plus a summary of what was found, which makes it easy to audit compliance in your log pipeline.
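The behavior described above can be approximated locally with a few regular expressions. The Python sketch below is an illustration, not the service's detectors or its request format: the `redact` function, its `enabled` and `custom_patterns` parameters, the `[REDACTED:type]` placeholder, and the deliberately simple patterns are all assumptions. Phone number and API key detection are omitted for brevity.

```python
import re
from collections import Counter

# Deliberately simple illustrative patterns (assumption); production detectors
# need stricter validation, such as a Luhn check for card numbers.
DETECTORS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "ipv4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "credit_card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}


def redact(text, enabled=None, custom_patterns=None):
    """Return (redacted_text, summary) where summary maps detector name to hit count.

    enabled: optional set of built-in detector names to run (default: all).
    custom_patterns: optional dict of name -> regex string for domain-specific secrets.
    """
    patterns = {
        name: rx for name, rx in DETECTORS.items()
        if enabled is None or name in enabled
    }
    for name, pattern in (custom_patterns or {}).items():
        patterns[name] = re.compile(pattern)

    found = Counter()
    for name, rx in patterns.items():
        # subn returns the new string and the number of replacements made.
        text, hits = rx.subn(f"[REDACTED:{name}]", text)
        if hits:
            found[name] = hits
    return text, dict(found)


line = "user=alice@example.com ip=10.0.0.12 token=tok_abc123XYZ"
clean, summary = redact(line, custom_patterns={"internal_token": r"tok_[A-Za-z0-9]+"})
print(clean)
print(summary)
```

Passing `enabled={"email"}` would leave the IP address untouched, mirroring the per-type toggles. The summary dictionary is the part you would forward to an audit log.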