Automated Entity Extraction with LLMs

Dirceu Jr.
1 min read · Sep 18, 2023

The year was 2009, and I was searching for a house to rent. As an aspiring Ruby programmer, I had the idea of automatically visiting real estate websites, extracting information about the listed properties, such as the address, and plotting them on a map.

NLP (Natural Language Processing) was already a hot topic, but the solution I managed to develop depended on rules written by hand, as XPath expressions or CSS selectors, with a different set for each real estate agency.

Years passed. I knew it should be possible to do this more automatically, without specific rules for each site, but I didn’t know how. Maybe the big companies knew. Possibly there was an API for it, like AlchemyAPI. But only now, with the release of GPT-3.5, did I get it.

LLMs (Large Language Models), like OpenAI’s GPT (Generative Pre-trained Transformer), have numerous capabilities, and one of them is handling NLP tasks such as automated entity extraction.

How

If you copy the content of a web page, such as an e-commerce product list or a real estate listing page, and send it to GPT-3.5 or GPT-4 with a prompt like “parse and return product names in JSON format” or “parse and return properties in JSON format”, you will be amazed at the results. It is the same surprise that other use cases give everyone who tries Artificial Intelligence and its capabilities. Try it yourself.
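To make the idea concrete, here is a minimal Python sketch of that workflow. It is an illustration under assumptions, not the exact code I used: the function names (build_prompt, extract_json, extract_entities) are hypothetical, and the model call is passed in as a plain function named complete, so you can wrap whichever GPT client you prefer. The fake_complete function only stands in for a real model so the example runs offline.

```python
import json
from typing import Callable, List


def build_prompt(page_text: str, entity: str) -> str:
    """Combine the instruction and the copied page content into one prompt."""
    return (
        f"Parse the text below and return {entity} in JSON format, "
        "as a JSON array of objects. Return only JSON.\n\n"
        f"{page_text}"
    )


def extract_json(reply: str) -> List[dict]:
    """Pull the JSON array out of the reply.

    Models sometimes wrap the JSON in prose or code fences, so we take
    everything between the first '[' and the last ']'.
    """
    start = reply.find("[")
    end = reply.rfind("]")
    if start == -1 or end < start:
        raise ValueError("no JSON array found in model reply")
    return json.loads(reply[start:end + 1])


def extract_entities(
    page_text: str, entity: str, complete: Callable[[str], str]
) -> List[dict]:
    """Send the page text to the model and parse the entities it returns.

    `complete` is an assumption: any function that takes a prompt string
    and returns the model's reply as a string.
    """
    return extract_json(complete(build_prompt(page_text, entity)))


def fake_complete(prompt: str) -> str:
    """Stand-in for a real LLM call, used only to run the sketch offline."""
    return 'Here is the result:\n[{"name": "Blue Mug", "price": "9.90"}]'


if __name__ == "__main__":
    page = "Blue Mug - $9.90 - Add to cart"
    print(extract_entities(page, "product names", fake_complete))
```

The same three functions work for properties instead of products: only the entity string and the page text change, which is the whole point compared to one set of selectors per site.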

Written by Dirceu Jr.

software developer, tech enthusiast, applied artificial intelligence student 👨‍💻 developer of 🌐 http://coora-ai.com 🧭 http://igapo.tech