AI Discovery Pipeline Overview
The buyer discovery system is implemented as a multi-stage AI pipeline.
Instead of performing a simple search query, the system executes a sequence of structured reasoning steps.
Pipeline structure:
Product Input
↓
Product Analysis
↓
Query Expansion
↓
Company Retrieval
↓
Entity Extraction
↓
Company Enrichment
↓
Segment Classification
↓
Deduplication
↓
Scoring Engine
↓
Ranking
↓
Decision Maker Discovery
Each stage transforms raw web data into increasingly structured information.
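The stage sequence above can be read as function composition. The following is a minimal sketch, assuming each stage is a function that takes the accumulated state and returns an updated copy. The `run_pipeline` helper and the two placeholder stages are hypothetical and only illustrate the chaining; they are not the platform's actual API.

```python
from functools import reduce
from typing import Callable

# A stage takes the accumulated pipeline state and returns an updated copy.
Stage = Callable[[dict], dict]

def run_pipeline(state: dict, stages: list[Stage]) -> dict:
    """Apply each stage in order, feeding the output of one into the next."""
    return reduce(lambda acc, stage: stage(acc), stages, state)

# Hypothetical placeholder stages; real ones would call an LLM or a search API.
def product_analysis(state: dict) -> dict:
    return {**state, "profile": {"category": state["product"]}}

def query_expansion(state: dict) -> dict:
    category = state["profile"]["category"]
    return {**state, "queries": [f"{category} distributor {state['country']}"]}

result = run_pipeline(
    {"product": "stain remover", "country": "Germany"},
    [product_analysis, query_expansion],
)
print(result["queries"])
```

Keeping every stage behind the same signature makes it straightforward to add, remove, or reorder stages without touching the others.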
Product Analysis
The discovery pipeline begins with product understanding.
The system must interpret the product description provided by the user.
Example input:
textile stain remover spray
The AI system extracts structured product attributes.
Example output:
| attribute | value |
|---|---|
| industry | textile chemicals |
| category | stain remover |
| product type | aerosol chemical |
| application | textile manufacturing |
| synonyms | textile spot remover |
The goal is to build a product knowledge profile.
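One way to hold the product knowledge profile is a small typed record. This sketch assumes the field names mirror the attribute table above; the `ProductProfile` class itself is hypothetical.

```python
from dataclasses import asdict, dataclass, field

@dataclass
class ProductProfile:
    """Structured attributes extracted from a free-text product description."""
    industry: str
    category: str
    product_type: str
    application: str
    synonyms: list[str] = field(default_factory=list)

# Values taken from the example output table for "textile stain remover spray".
profile = ProductProfile(
    industry="textile chemicals",
    category="stain remover",
    product_type="aerosol chemical",
    application="textile manufacturing",
    synonyms=["textile spot remover"],
)
print(asdict(profile))
```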
Query Expansion
Once the product context is understood, the system generates multiple search queries.
This step is essential because exporters rarely know the exact keywords used by buyers.
Example generated queries:
- textile chemical distributor Germany
- textile auxiliaries distributor Germany
- garment factory chemical supplier Germany
- Textilchemie Händler Deutschland
- Textilchemikalien Vertrieb Deutschland
The system generates queries in:
- English
- local language
- industry terminology
Querying across all three widens discovery coverage beyond what a single English keyword search would return.
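The combination step can be sketched as a cross product of subject terms and role terms per language. The term lists below are hypothetical stand-ins; in the described system they would come from the product analysis stage rather than a hard-coded dictionary.

```python
from itertools import product

# Hypothetical term lists per language (assumed, not taken from the platform).
TERMS = {
    "en": {
        "subjects": ["textile chemical", "textile auxiliaries"],
        "roles": ["distributor"],
        "country": "Germany",
    },
    "de": {
        "subjects": ["Textilchemie", "Textilchemikalien"],
        "roles": ["Händler", "Vertrieb"],
        "country": "Deutschland",
    },
}

def expand_queries(terms: dict) -> list[str]:
    """Combine every subject with every role, per language, and append the country."""
    queries = []
    for lang in terms.values():
        for subject, role in product(lang["subjects"], lang["roles"]):
            queries.append(f"{subject} {role} {lang['country']}")
    return queries

queries = expand_queries(TERMS)
for query in queries:
    print(query)
```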
Retrieval Layer
The retrieval layer collects candidate companies from multiple sources.
Sources include:
- web search results
- industry directories
- company websites
- trade portals
- LinkedIn company pages
- trade association lists
Example discovery result:
| company | website | country |
|---|---|---|
| TextilChem GmbH | textilchem.de | Germany |
| ChemTex Solutions | chemtex.eu | Germany |
| GarmentAux Trading | garmentaux.com | Germany |
At this stage the data is noisy and unverified.
Entity Extraction
The system extracts structured company information from raw results.
Fields extracted include:
| field | example |
|---|---|
| company_name | TextilChem GmbH |
| website | textilchem.de |
| country | Germany |
| description | Textile chemical distributor |
Extraction uses:
- HTML parsing
- LLM interpretation
- structured prompt extraction
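The HTML parsing part of extraction can be illustrated with the standard library alone. This is a sketch under the assumption that the company name sits in the page `<title>` and the description in a `<meta name="description">` tag; real pages vary, which is why the stage also relies on LLM interpretation. The class and function names are hypothetical.

```python
from html.parser import HTMLParser

class CompanyPageParser(HTMLParser):
    """Collects the <title> text and the meta description from a page."""

    def __init__(self) -> None:
        super().__init__()
        self._in_title = False
        self.title = ""
        self.description = ""

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attr_map.get("name") == "description":
            self.description = attr_map.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data

def extract_company(html: str, website: str, country: str) -> dict:
    """Build a record with the four fields listed in the table above."""
    parser = CompanyPageParser()
    parser.feed(html)
    return {
        "company_name": parser.title.strip(),
        "website": website,
        "country": country,
        "description": parser.description.strip(),
    }

sample_html = (
    "<html><head><title>TextilChem GmbH</title>"
    '<meta name="description" content="Textile chemical distributor">'
    "</head><body></body></html>"
)
record = extract_company(sample_html, "textilchem.de", "Germany")
print(record)
```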
Company Enrichment
The next step is to understand what the company actually does.
Questions the AI attempts to answer:
- Is this company a distributor?
- Do they serve the textile industry?
- Do they sell chemicals?
- Are they manufacturers or traders?
Example enrichment result:
| company | role | sector |
|---|---|---|
| TextilChem GmbH | distributor | textile chemicals |
| ChemTex Solutions | supplier | garment auxiliaries |
Segment Classification
Companies are classified into segments.
Segment definitions:
| segment | meaning |
|---|---|
| S1 | ideal distributor |
| S2 | potential buyer |
| S3 | related company |
Example:
| company | segment |
|---|---|
| TextilChem GmbH | S1 |
| ChemTex Solutions | S2 |
| IndustrialTrade AG | S3 |
Segment classification helps prioritize results: S1 companies are the strongest leads and S3 the weakest.
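The source does not specify how a segment is assigned, so the rule below is purely an illustrative assumption: a distributor that serves the target industry is S1, any other company that serves it is S2, and everything else is S3. The `classify_segment` function and its inputs are hypothetical.

```python
def classify_segment(role: str, serves_target_industry: bool) -> str:
    """Map enrichment output to a segment using a simple assumed rule."""
    if role == "distributor" and serves_target_industry:
        return "S1"  # ideal distributor
    if serves_target_industry:
        return "S2"  # potential buyer
    return "S3"      # related company

# Roles for the first two companies come from the enrichment example;
# the role for IndustrialTrade AG is an assumption made for illustration.
companies = [
    ("TextilChem GmbH", "distributor", True),
    ("ChemTex Solutions", "supplier", True),
    ("IndustrialTrade AG", "trader", False),
]
segments = {name: classify_segment(role, serves) for name, role, serves in companies}
print(segments)
```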
Deduplication
Because multiple queries may return the same companies, duplicates must be removed.
Deduplication checks include:
- domain match
- company name similarity
- address similarity
Example duplicates:
- TextilChem GmbH
- TextilChem GmbH & Co KG
- textilchem.de
All are merged into a single entity.
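The domain and name checks can be sketched with the standard library. This example assumes that legal-form suffixes such as GmbH or KG should be ignored when comparing names, and uses `difflib.SequenceMatcher` as a stand-in similarity measure with an arbitrary 0.9 threshold; the platform's actual matching logic is not described in the source. Address similarity is omitted.

```python
import re
from difflib import SequenceMatcher

# Assumed list of legal-form tokens and punctuation to ignore in company names.
LEGAL_NOISE = re.compile(r"\b(gmbh|co|kg|ag|ltd|inc)\b|[&.,]", re.IGNORECASE)

def normalize_name(name: str) -> str:
    """Lowercase, drop legal-form tokens and punctuation, collapse whitespace."""
    cleaned = LEGAL_NOISE.sub(" ", name.lower())
    return " ".join(cleaned.split())

def normalize_domain(website: str) -> str:
    """Strip the scheme, any path, and a leading 'www.'."""
    domain = re.sub(r"^https?://", "", website.lower())
    domain = domain.split("/")[0]
    return domain.removeprefix("www.")

def is_duplicate(a: dict, b: dict, threshold: float = 0.9) -> bool:
    """Same entity if the domains match or the normalized names are near-identical."""
    if a.get("website") and b.get("website"):
        if normalize_domain(a["website"]) == normalize_domain(b["website"]):
            return True
    ratio = SequenceMatcher(
        None, normalize_name(a["name"]), normalize_name(b["name"])
    ).ratio()
    return ratio >= threshold

first = {"name": "TextilChem GmbH", "website": "textilchem.de"}
second = {"name": "TextilChem GmbH & Co KG", "website": None}
other = {"name": "ChemTex Solutions", "website": "chemtex.eu"}
print(is_duplicate(first, second), is_duplicate(first, other))
```

Checking the domain first is cheap and unambiguous; the fuzzy name comparison only acts as a fallback when a record has no website or the domains differ.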
Scoring Engine
After enrichment, each company is scored using a deterministic scoring model.
Example scoring formula:
FitScore =
IndustryMatch × 0.40
+ DistributorProbability × 0.30
+ CountryMatch × 0.20
+ CompanySize × 0.10
Where:
| factor | description |
|---|---|
| IndustryMatch | whether the company operates in the target industry |
| DistributorProbability | likelihood that the company acts as a distributor |
| CountryMatch | geographic relevance to the target country |
| CompanySize | operational capacity |
Example Scoring Table
| company | industry match | distributor probability | score |
|---|---|---|---|
| TextilChem GmbH | 0.95 | 0.85 | 0.90 |
| ChemTex Solutions | 0.80 | 0.60 | 0.73 |
| IndustrialTrade AG | 0.55 | 0.40 | 0.52 |
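The scoring formula is a weighted sum whose weights add up to 1.0, so a company with every factor at 1.0 scores exactly 1.0. A minimal sketch follows, assuming each factor is a value between 0 and 1. The table above shows only two of the four factors, so the `country_match` and `company_size` inputs here are illustrative assumptions chosen to be consistent with the listed score, not values from the source.

```python
# Weights from the example scoring formula.
WEIGHTS = {
    "industry_match": 0.40,
    "distributor_probability": 0.30,
    "country_match": 0.20,
    "company_size": 0.10,
}

def fit_score(factors: dict[str, float]) -> float:
    """Weighted sum of factor values, each expected in the range 0..1."""
    return sum(weight * factors[name] for name, weight in WEIGHTS.items())

# First two values come from the TextilChem GmbH row of the example table;
# country_match and company_size are assumed for illustration.
score = fit_score({
    "industry_match": 0.95,
    "distributor_probability": 0.85,
    "country_match": 1.0,
    "company_size": 0.65,
})
print(round(score, 2))
```

Because the model is a fixed formula rather than a learned one, the same inputs always produce the same score, which makes individual rankings easy to explain and audit.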
Ranking
After scoring, companies are sorted by score in descending order.
Example result set:
| rank | company | score |
|---|---|---|
| 1 | TextilChem GmbH | 0.90 |
| 2 | ChemTex Solutions | 0.73 |
| 3 | IndustrialTrade AG | 0.52 |
The product UI displays the top 25 companies but visually highlights the top 5 likely buyers.
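The ranking and display limits can be sketched as a sort followed by a slice. The `rank_companies` function and the `highlighted` flag are hypothetical names; the limits of 25 and 5 come from the UI description above.

```python
def rank_companies(
    scored: list[dict], display_limit: int = 25, highlight_limit: int = 5
) -> list[dict]:
    """Sort by score (highest first), keep the displayed set, flag the highlighted ones."""
    ordered = sorted(scored, key=lambda company: company["score"], reverse=True)
    return [
        {**company, "rank": position, "highlighted": position <= highlight_limit}
        for position, company in enumerate(ordered[:display_limit], start=1)
    ]

scored = [
    {"company": "ChemTex Solutions", "score": 0.73},
    {"company": "IndustrialTrade AG", "score": 0.52},
    {"company": "TextilChem GmbH", "score": 0.90},
]
for row in rank_companies(scored):
    print(row["rank"], row["company"], row["score"])
```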
Decision Maker Discovery
For top-ranked companies, the system attempts to identify decision makers.
Sources:
- LinkedIn
- company websites
- public directories
Target roles:
- purchasing manager
- import manager
- procurement director
- owner
Example contact record:
| name | title | email |
|---|---|---|
| Anna Müller | Purchasing Manager | anna@textilchem.de |
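Filtering contacts by target role can be approximated with a keyword check on the job title. This is a deliberately naive sketch: the keyword list is derived from the target roles above, and plain substring matching is an assumption that would need refinement for other languages and title variants.

```python
# Keywords derived from the target roles listed above (assumed matching rule).
TARGET_ROLE_KEYWORDS = ("purchasing", "import", "procurement", "owner")

def is_decision_maker(title: str) -> bool:
    """Return True when the job title contains any target-role keyword."""
    lowered = title.lower()
    return any(keyword in lowered for keyword in TARGET_ROLE_KEYWORDS)

print(is_decision_maker("Purchasing Manager"))
print(is_decision_maker("Software Engineer"))
```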
Feedback Loop
Users can rate the relevance of companies.
Options:
- Relevant
- Maybe
- Not relevant
This feedback improves future ranking.
Example learning signal:
product: textile chemicals
country: Germany
company: TextilChem GmbH
feedback: relevant
Over time this builds a relevance dataset.
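A learning signal like the one above could be stored as a validated record. This sketch assumes the three rating options map to lowercase string values; the `FeedbackSignal` class and the in-memory `relevance_dataset` list are hypothetical.

```python
from dataclasses import dataclass

# The three rating options offered to users, as lowercase values (assumed encoding).
ALLOWED_FEEDBACK = {"relevant", "maybe", "not relevant"}

@dataclass(frozen=True)
class FeedbackSignal:
    product: str
    country: str
    company: str
    feedback: str

    def __post_init__(self) -> None:
        # Reject values outside the three known rating options.
        if self.feedback not in ALLOWED_FEEDBACK:
            raise ValueError(f"unknown feedback value: {self.feedback!r}")

# In-memory stand-in for the relevance dataset.
relevance_dataset: list[FeedbackSignal] = []
relevance_dataset.append(
    FeedbackSignal("textile chemicals", "Germany", "TextilChem GmbH", "relevant")
)
print(relevance_dataset[0])
```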
Accuracy Framework
Discovery quality must be measured.
Metrics:
| metric | meaning |
|---|---|
| accuracy score | share of returned companies that are relevant |
| noise ratio | share of returned companies that are irrelevant |
| distributor recall | share of true distributors that were found |
Launch thresholds:
- Accuracy > 0.70
- Noise < 0.15
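The launch check can be sketched from labelled results. This example assumes that accuracy is the share of results rated "relevant" and noise is the share rated "not relevant", with "maybe" counting toward neither; the source does not define the metrics more precisely, and the function names are hypothetical.

```python
def discovery_metrics(labels: list[str]) -> dict[str, float]:
    """Compute accuracy and noise as shares of all labelled results (assumed definitions)."""
    total = len(labels)
    if total == 0:
        raise ValueError("need at least one labelled result")
    return {
        "accuracy": labels.count("relevant") / total,
        "noise": labels.count("not relevant") / total,
    }

def meets_launch_thresholds(metrics: dict[str, float]) -> bool:
    """Accuracy must exceed 0.70 and noise must stay below 15%."""
    return metrics["accuracy"] > 0.70 and metrics["noise"] < 0.15

labels = ["relevant"] * 8 + ["maybe"] + ["not relevant"]
metrics = discovery_metrics(labels)
print(metrics, meets_launch_thresholds(metrics))
```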
Latency targets:
| stage | target latency |
|---|---|
| product analysis | <2 sec |
| query expansion | <2 sec |
| retrieval | <5 sec |
| enrichment | <5 sec |
| ranking | <1 sec |
Total discovery time:
<15 seconds
This ensures a fast user experience.
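The per-stage targets add up to the 15-second total, which a simple budget check makes explicit. The `stages_over_budget` helper and the measured timings are hypothetical; only the target values come from the table above.

```python
# Target latency per stage in seconds, from the table above.
LATENCY_BUDGET_S = {
    "product analysis": 2.0,
    "query expansion": 2.0,
    "retrieval": 5.0,
    "enrichment": 5.0,
    "ranking": 1.0,
}

def stages_over_budget(measured_s: dict[str, float]) -> list[str]:
    """Return stages whose measured latency is not strictly below its target."""
    return [
        stage
        for stage, limit in LATENCY_BUDGET_S.items()
        if measured_s.get(stage, 0.0) >= limit
    ]

print(sum(LATENCY_BUDGET_S.values()))
print(stages_over_budget({"retrieval": 6.2, "ranking": 0.4}))
```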
Discovery Pipeline Summary
The discovery system converts:
Product description
↓
Structured buyer candidates
↓
Ranked companies
↓
Decision makers
This capability forms the core competitive advantage of the platform.