Overview

Using the Job Parser

To use the Job Parser, pass your job document to our API. The parser extracts multiple data points, such as employer name, job location, job description, required skills, required degree, and start date (for a full list of fields, refer to the Parse a Job API Reference). The Job Parser can also output the original document in a variety of formats, including HTML, PDF, and RTF.
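As a rough illustration of passing a document to the API, the sketch below builds a JSON request body in Python. The field names `DocumentAsBase64String` and `DocumentLastModified`, the placeholder URL, and the helper `build_parse_job_request` are assumptions made for this example; confirm the actual endpoint, headers, and fields in the Parse a Job API Reference before relying on them.

```python
import base64
import json
from datetime import date

# Hypothetical placeholder; the real endpoint is listed in the Parse a Job API Reference.
PARSE_JOB_URL = "https://api.example.com/parser/job"


def build_parse_job_request(document_bytes: bytes, last_modified: date) -> dict:
    """Build a JSON-serializable request body for a job parsing call.

    The field names used here are assumptions for illustration only.
    """
    return {
        # Binary documents (PDF, DOCX, ...) are base64-encoded so they can travel inside JSON.
        "DocumentAsBase64String": base64.b64encode(document_bytes).decode("ascii"),
        # ISO 8601 date string, e.g. "2024-01-15".
        "DocumentLastModified": last_modified.isoformat(),
    }


body = build_parse_job_request(b"Senior Data Engineer, Berlin", date(2024, 1, 15))
payload = json.dumps(body)  # string to send as the body of the POST request
```

The payload would then be sent in a POST request together with your account credentials; the sketch stops before the network call so that it runs without an account.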

Optical Character Recognition

Optical Character Recognition (OCR) ensures that scanned or photographed documents are automatically detected and converted into text, so that the text can be parsed. Without OCR, such documents cannot be parsed. On average, approximately 5% of documents require OCR.

OCR does not affect the response time for non-image documents. Documents that do require OCR take longer to process because converting images to text is computationally intensive. For the same reason, we limit OCR to 10 pages and stop processing the document after 120 seconds.
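On the client side, the limits above suggest allowing more than 120 seconds before giving up on a request that may involve OCR. The following is a minimal sketch under that assumption; the helper names and the 15-second network margin are hypothetical choices for this example, not values taken from the API.

```python
OCR_MAX_SECONDS = 120  # processing cap stated above
OCR_MAX_PAGES = 10     # page cap stated above


def client_timeout_seconds(network_margin: float = 15.0) -> float:
    """Return an HTTP timeout that outlasts the OCR processing cap.

    network_margin is an assumed allowance for upload and download time.
    """
    return OCR_MAX_SECONDS + network_margin


def exceeds_ocr_page_limit(page_count: int) -> bool:
    """Report whether a scanned document has more pages than the OCR cap."""
    return page_count > OCR_MAX_PAGES


timeout = client_timeout_seconds()
```

A caller could pass `timeout` to its HTTP client and use `exceeds_ocr_page_limit` to flag long scans before submitting them.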

When OCR is enabled, a small additional transaction cost is incurred with each parsing transaction. Scans and full-image documents are auto-detected, and OCR is applied when necessary. OCR can be configured in the Tx Console.