
Mine the data around you with darzar. No manual work is required; our AI adapts to your sources.
darzar automatically extracts data from databases, files, websites, emails and even images. It finds items such as products, services, company data, messages, conversations, forums, events, news, articles, books, movies and recommendations.
Benefits
- Automatic: no manual work, no retraining when sites are updated
- Immediate: no delays for specialized training, no manual tuning
- Simple: start from the website address
- Volume: fast and accurate, even on large volumes
- Insights via statistics, automatic catalogs, hierarchies, taxonomies
- A knowledge treasure ready to be mined: usable for machine learning, dashboards and BI
- GDPR preparation by finding sensitive personal data
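As an illustration of what finding sensitive personal data can mean in practice, here is a minimal sketch that scans free text for email addresses and phone-like numbers. It is not darzar's implementation: the function name `find_personal_data` and both regular expressions are assumptions made for this example, and the patterns are deliberately simplistic.

```python
import re

# Simplistic, illustrative patterns (assumptions, not production-grade detectors).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s-]{7,}\d")


def find_personal_data(text):
    """Return (label, value) pairs for every email or phone-like match in text."""
    findings = []
    for label, pattern in (("email", EMAIL_RE), ("phone", PHONE_RE)):
        for match in pattern.finditer(text):
            findings.append((label, match.group()))
    return findings


if __name__ == "__main__":
    sample = "Contact Jane at jane.doe@example.com or +40 721 000 111."
    for label, value in find_personal_data(sample):
        print(f"{label}: {value}")
```

A real GDPR scan would need far broader coverage (names, addresses, identifiers) and validation of each hit; the sketch only shows the basic idea of flagging candidate fields for review.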
Features
Automatic Bots
- Automatic, adaptable Artificial Intelligence bots deliver faster and more accurately, and replace error-prone, costly manual work
Extract Everything
- Extract the data that is valuable for your business: companies, images, discussions, opening hours, articles, products, services, company data, messages, conversations, forums, events, news, books, movies, recommendations
Data extraction
- Structured: databases, NoSQL databases (graph, document, MongoDB, Cassandra, Redis)
- Semi-structured: mobile apps, JSON, CSV, XSL, Excel, XML, exports, metadata, microdata, Open Graph, Twitter Cards
- Unstructured Text Data: emails, documents (PDF, PDF/A, Word), websites, files, text, Markdown, micro-languages
- Fully Unstructured: scans (JPG, PNG, TIFF, PDF) and mobile scans, extracted with OCR, multi-language text recognition, barcode and matrix code reading, classification
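To make the semi-structured case concrete, the sketch below pulls Open Graph metadata out of an HTML page using only Python's standard library. It illustrates the general technique and is not how darzar itself works; `OpenGraphParser`, `extract_open_graph` and the sample page are hypothetical names and data invented for this example.

```python
from html.parser import HTMLParser


class OpenGraphParser(HTMLParser):
    """Collects <meta property="og:..."> tags into a dictionary."""

    def __init__(self):
        super().__init__()
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        prop = attributes.get("property") or ""
        content = attributes.get("content")
        # Keep only Open Graph properties, stored without the "og:" prefix.
        if prop.startswith("og:") and content is not None:
            self.properties[prop[3:]] = content


def extract_open_graph(html):
    """Return the Open Graph properties found in an HTML string."""
    parser = OpenGraphParser()
    parser.feed(html)
    return parser.properties


# Hypothetical sample page used only for this illustration.
SAMPLE_PAGE = """
<html><head>
<meta property="og:title" content="Blue Ceramic Mug" />
<meta property="og:type" content="product" />
<meta name="description" content="Not an Open Graph tag" />
</head><body></body></html>
"""

if __name__ == "__main__":
    print(extract_open_graph(SAMPLE_PAGE))
```

The same pattern extends to Twitter Cards by matching `name="twitter:..."` attributes instead of `property="og:..."`.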
Phases
- Seeding - data source discovery
- Crawl - exhaustive mining of raw data
- Scrape - data extraction from downloaded raw data
- Correction - crop, smart crop, filtering, rescanning
- Transform - consolidate, standardize, cleanse, reconcile data
- Import Sources - files (including Dropbox, Google Drive), Synology, FTP, links, emails, eCommerce shops, websites
- Search - labels, metadata, filters, automatic classification
- Store - Backup, History, Versioning, Time series
- Structure - clustering, automatic catalogs, hierarchies, taxonomies, website specifics, updates, changes, structure, comparisons, history
- Consume - dashboards, BI, export as structured data
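The first phases above can be sketched as a small pipeline. This is a toy illustration under stated assumptions, not darzar's actual code: the in-memory `FAKE_WEB` dictionary stands in for real downloads, and the function names `seed`, `crawl`, `scrape` and `transform` are hypothetical, chosen to mirror the phase names.

```python
import re
from collections import deque

# Hypothetical in-memory "web" (URL -> raw HTML) so the sketch needs no network.
FAKE_WEB = {
    "https://example.com/": '<a href="https://example.com/a">A</a> '
                            '<a href="https://example.com/b">B</a>',
    "https://example.com/a": "<h1> Blue  Mug </h1>",
    "https://example.com/b": '<h1>red mug</h1> <a href="https://example.com/a">A</a>',
}

LINK_RE = re.compile(r'href="([^"]+)"')
TITLE_RE = re.compile(r"<h1>(.*?)</h1>", re.S)


def seed():
    """Seeding: discover the starting data sources."""
    return ["https://example.com/"]


def crawl(seeds, fetch):
    """Crawl: breadth-first download of every reachable page, each visited once."""
    raw_pages = {}
    queue = deque(seeds)
    while queue:
        url = queue.popleft()
        if url in raw_pages:
            continue
        page = fetch(url)
        if page is None:  # unreachable source, skip it
            continue
        raw_pages[url] = page
        queue.extend(LINK_RE.findall(page))
    return raw_pages


def scrape(raw_pages):
    """Scrape: extract items (here, <h1> titles) from the downloaded raw data."""
    return [
        {"source": url, "title": title}
        for url, page in raw_pages.items()
        for title in TITLE_RE.findall(page)
    ]


def transform(items):
    """Transform: standardize whitespace and capitalization of each title."""
    return [
        {**item, "title": " ".join(item["title"].split()).title()}
        for item in items
    ]


if __name__ == "__main__":
    pages = crawl(seed(), FAKE_WEB.get)
    for item in transform(scrape(pages)):
        print(item["source"], "->", item["title"])
```

Keeping each phase as a separate function with plain data in and out means later phases such as storing or structuring could be added without touching the earlier ones.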
Demo
- Dashboard to explore discovered data at: www.darzar.com
Technology Stack
- Big Data: Spark
- Scala, Akka, Play Framework, Java, Bootstrap 4, responsive design