Detect File Types from Content (Not Just Extensions)
File extensions lie. A .jpg might actually be a PHP shell, and a .pdf might be a ZIP archive. The File Type service inspects the first bytes of a file — its magic number — to determine the true content type.
What You'll Learn
- How to detect image types by sending base64-encoded content
- How PDF detection works with magic byte signatures
- Why content-based detection is more secure than trusting file extensions
Prerequisites
Before you start: You need an API key. Follow the Platform Quick Start to get one.
- curl (or any HTTP client)
- A valid API key in the X-API-KEY header
- A sample file to test (or use the examples below)
Step 1: Detect a PNG Image
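Send at least the first 64 bytes of the file, base64-encoded; the examples below send the first 256. A PNG always starts with the bytes 89 50 4E 47, the first four bytes of its 8-byte signature.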
# Encode the first 256 bytes of a local image
# (-w0 disables line wrapping in GNU base64)
SAMPLE=$(head -c 256 photo.png | base64 -w0)
curl -X POST /api/file-type/detect \
  -H "Content-Type: application/json" \
  -H "X-API-KEY: your-api-key" \
  -d "{\"content\": \"$SAMPLE\"}"
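
# Same request in Python
import base64
import requests

# Read and base64-encode only the first 256 bytes of the file
with open("photo.png", "rb") as f:
    sample = base64.b64encode(f.read(256)).decode()

# requests needs an absolute URL: prefix the path with your API host
resp = requests.post(
    "/api/file-type/detect",
    headers={"X-API-KEY": "your-api-key"},
    json={"content": sample},
)
print(resp.json())
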
{
  "status": "OK",
  "data": {
    "detectedType": "image/png",
    "extension": "png",
    "confidence": 1.0,
    "magicBytes": "89504E47",
    "description": "PNG image"
  }
}
A confidence of 1.0 means the magic bytes are an exact match for a known signature.
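To make the request body and the magicBytes field concrete, here is a minimal local sketch. The helper names build_payload and magic_hex are hypothetical and not part of the service; the sketch only shows how the leading bytes map to the base64 content value and to the hex string in the response.

```python
import base64


def build_payload(head: bytes) -> dict:
    """Wrap the leading bytes of a file in the JSON body used in this step."""
    return {"content": base64.b64encode(head).decode("ascii")}


def magic_hex(head: bytes, length: int = 4) -> str:
    """Return the first `length` bytes as uppercase hex, like the magicBytes field."""
    return head[:length].hex().upper()


# The 8-byte PNG signature; its first four bytes are 89 50 4E 47.
png_head = bytes.fromhex("89504E470D0A1A0A")
payload = build_payload(png_head)
print(magic_hex(png_head))  # 89504E47
```

In practice `head` would come from reading the first bytes of the file, as in the Python example above.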
Step 2: Detect a PDF Document
PDF files start with %PDF (hex 25504446). Even if someone renames a PDF to .docx, the magic bytes reveal the truth.
SAMPLE=$(head -c 256 report.pdf | base64 -w0)
curl -X POST /api/file-type/detect \
  -H "Content-Type: application/json" \
  -H "X-API-KEY: your-api-key" \
  -d "{\"content\": \"$SAMPLE\"}"

{
  "status": "OK",
  "data": {
    "detectedType": "application/pdf",
    "extension": "pdf",
    "confidence": 1.0,
    "magicBytes": "25504446",
    "description": "PDF document"
  }
}
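The idea behind signature matching can be sketched locally as a prefix lookup. This is an illustration only: the SIGNATURES table and the sniff function are hypothetical, cover just three well-known signatures, and are not the service's actual implementation or signature list.

```python
from typing import Optional

# Illustrative signature table: (magic prefix, MIME type).
# Assumption: a tiny sample, not the service's real list.
SIGNATURES = [
    (bytes.fromhex("89504E47"), "image/png"),
    (b"%PDF", "application/pdf"),        # hex 25504446
    (b"PK\x03\x04", "application/zip"),  # ZIP local file header
]


def sniff(head: bytes) -> Optional[str]:
    """Return the MIME type whose magic prefix matches, or None if unknown."""
    for magic, mime in SIGNATURES:
        if head.startswith(magic):
            return mime
    return None


# The file name never enters the check, so renaming report.pdf to
# report.docx does not change the result.
print(sniff(b"%PDF-1.7\n"))  # application/pdf
```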
Step 3: Why Magic Numbers Beat Extensions
Relying on file extensions creates real security holes:
| Scenario | Extension Says | Magic Bytes Say |
|---|---|---|
| Renamed PHP webshell | .jpg | text/x-php |
| ZIP disguised as document | .pdf | application/zip |
| EXE with double extension | .pdf.exe | application/x-dosexec |
Security rule: Never trust user-supplied file extensions for access control or rendering decisions. Always verify with content-based detection.
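One way to apply this rule is to compare the extension a user supplied with the extension the service detected. The sketch below assumes the detected value comes from the extension field of the response; the function names and the ALIASES map are hypothetical.

```python
from pathlib import PurePath

# Assumption: treat these spellings as the same extension.
ALIASES = {"jpeg": "jpg"}


def claimed_extension(filename: str) -> str:
    """Return the final extension of a file name, lowercased, without the dot."""
    ext = PurePath(filename).suffix.lower().lstrip(".")
    return ALIASES.get(ext, ext)


def extension_mismatch(filename: str, detected_extension: str) -> bool:
    """True when the file name claims a different type than the content shows."""
    detected = detected_extension.lower()
    detected = ALIASES.get(detected, detected)
    return claimed_extension(filename) != detected


# A PDF renamed to .docx: the name says docx, the content says pdf.
print(extension_mismatch("report.docx", "pdf"))  # True
print(extension_mismatch("photo.PNG", "png"))    # False
```

Only the final suffix is compared, so a double extension such as invoice.pdf.exe is treated as exe, not pdf.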
Integration Tips
- Upload validation: Check files before saving to storage. Reject mismatched extensions to block disguised malware.
- Minimal payload: The first 256 bytes, as sent in the examples above, are enough. No need to upload the entire file.
- Content-Type headers: Set Content-Type from the detected MIME type when serving files back to users, rather than deriving it from the uploaded file name.
- Combine with virus scanning: Detect type first, then route to the appropriate scanner based on MIME type.
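The upload-validation tip could be sketched as a small gate over the detection response. This assumes the response shape shown in Steps 1 and 2; the allowlist, the confidence threshold, and the check_detection name are hypothetical choices for illustration, not service defaults.

```python
from typing import Tuple

# Hypothetical policy values; tune them for your application.
ALLOWED_TYPES = {"image/png", "application/pdf"}
MIN_CONFIDENCE = 0.9


def check_detection(
    response: dict,
    allowed: set = ALLOWED_TYPES,
    min_confidence: float = MIN_CONFIDENCE,
) -> Tuple[bool, str]:
    """Return (accepted, detail): the MIME type on success, a reason on failure."""
    if response.get("status") != "OK":
        return False, "detection failed"
    data = response.get("data") or {}
    mime = data.get("detectedType")
    if mime not in allowed:
        return False, f"type not allowed: {mime}"
    if data.get("confidence", 0.0) < min_confidence:
        return False, "low confidence"
    return True, mime


# Response body copied from Step 1.
png_response = {
    "status": "OK",
    "data": {
        "detectedType": "image/png",
        "extension": "png",
        "confidence": 1.0,
        "magicBytes": "89504E47",
        "description": "PNG image",
    },
}
print(check_detection(png_response))  # (True, 'image/png')
```

On success, the returned MIME type can be stored with the file and reused when serving it or when choosing a scanner.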
Next Steps
- Full API Reference — all supported MIME types and signatures
- Deduplication Tutorial — hash files after type detection to find duplicates
- Screenshot Tutorial — capture pages and verify the output is a valid image
- Try It Live — test file detection in your browser