Data Factory

Welcome to Spain's biggest Data Source

Create customised data to suit your needs

If you don’t find the information you need in our datasets, we can create them on demand. If the information you need is on various sites on the internet that you need to combine with your own information or open data, we can probably help you. With our web crawling, web scraping and data modelling capabilities, we can create data tailored to your specific needs.

If you need to combine information from different sources, either from websites or your own data, we can help you create the file you need for your data solution. We call it Data Factory.

Data Factory processes

Identification of the data to be processed

Automation of data extraction

Basic data processing (normalisation, deduplication)

Data tagging (or dictionary and taxonomy creation)

Creating a knowledge graph

(Optional)

Simple or complex algorithm applied to data (AI/ML)

Integration of the results for consumption: Publications, enrichment of other systems (BI, CRM...)

Verification of the integrity of the information
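As an illustration of the basic data processing step (normalisation, deduplication), here is a minimal Python sketch. The field names, the sample records and the `normalise` and `deduplicate` helpers are assumptions made for this example, not a description of the actual Data Factory implementation.

```python
import unicodedata


def normalise(value: str) -> str:
    """Lower-case the text, strip accents and collapse repeated whitespace."""
    decomposed = unicodedata.normalize("NFKD", value)
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return " ".join(without_accents.lower().split())


def deduplicate(records: list[dict], key_fields: tuple[str, ...]) -> list[dict]:
    """Keep the first record seen for each normalised key."""
    seen: set[tuple[str, ...]] = set()
    unique: list[dict] = []
    for record in records:
        key = tuple(normalise(str(record.get(field, ""))) for field in key_fields)
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


# Hypothetical sample records: the first two differ only in case, accents and spacing.
records = [
    {"name": "Farmacia  López", "city": "Madrid"},
    {"name": "farmacia lopez", "city": "MADRID"},
    {"name": "Iglesia de San Juan", "city": "Toledo"},
]
unique_records = deduplicate(records, key_fields=("name", "city"))
```

Comparing on a normalised key while keeping the original record means the published data retains its original spelling. Real projects would typically need fuzzier matching than this exact-key comparison.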

Applications of our Data Factory services

This technique has so much potential that only you know how it can best serve your business. Tell us your problem and we will advise you. Some of the ways our clients use it:

Custom dataset

Creation of high-quality databases from the internet, customised to your needs.

Brand monitoring

Monitor your brand reputation and track mentions and reviews online automatically.

Market research

Use the internet as a source of knowledge for business decisions: competitor analysis, entry into new markets, discovery of relationships, price comparison across stores, detection of changes in websites, and more.

Alternative financial data

Make informed decisions with alternative financial data gleaned from the internet.

Lead Generation

Create lists of potential company clients with web features: ecommerce listings, companies with LinkedIn on their site, companies that use PayPal as a payment gateway.

Process automation

Automate your reports and relate internal data to external sources, including the internet.

DO YOU WANT TO START CREATING YOUR OWN CUSTOMISED
DATABASE WITH DATA FACTORY?

Cases of data sets created with Data Factory

Tailor-made points of interest for map applications and services

The main American directory and mapping companies operating in the international market wanted to improve the accuracy and qualification of points of interest in Spain. As their local partner, and drawing on the quality of our street map and portal, we provided X,Y coordinates to improve the accuracy of points of interest such as airports, shops and companies. We also generated new points of interest by combining multiple data sources, such as hiking trails, churches and pharmacies.

Energy Efficiency Certificate (EEC)

There is a growing awareness among clients, banks and insurance companies regarding climate change and the correct fulfilment of their obligations in terms of corporate social responsibility. As a result, one client felt the need to have Energy Efficiency Certificates (EEC) for its property portfolio or for the properties associated with its services. Currently, less than 20% of the entire property stock in Spain has an official EEC.

Deyde DataCentric developed a system that extracts the actual labels from the different sources that publish them. For properties without this official certification, we developed mathematical models which, fed by data from real certifications and from the comparables used in the corresponding appraisals, made it possible to estimate the rating letters and the emissions and consumption figures. The result was exclusively created data that our client incorporated into the records of its property portfolio.

Environmental risks

The increasing occurrence of extreme natural phenomena as a result of climate change created the need for an insurance client to control the risk of its insured assets as much as possible.

To this end, a series of cartographic layers were generated with information on the existence of natural risks for the entire national territory, which were incorporated at the registry level. Three different layers were obtained corresponding to:

In each of these layers, in addition to the indicators associated with each type of risk, several additional indicators were constructed: the frequency indicator, which gives the probability of the corresponding event occurring, and the magnitude indicator, which gives the expected damage if it does occur.

Digital Maturity Indicator

Together with a client of ours that sells technological products, we came to the conclusion that digital maturity could be an important variable when segmenting their database to target commercial campaigns. The digital maturity of a company is not a data point that exists as such in any source of information, so we set out to create it.

In this case, using web crawling and web scraping techniques, we started with the digital footprint of the companies, which corresponds to all the information that can be obtained from their domains and web pages.

After reliably associating a company with its domains, at Deyde DataCentric we apply a series of processes based on NER (Named Entity Recognition) and NLP (Natural Language Processing) to extract information from this raw data.

Using different indicators extracted from this digital footprint, we created a Digital Maturity Indicator for companies that also tracks its evolution over time.
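To make the idea of combining footprint indicators into a single score more concrete, here is a minimal Python sketch. The three signals, their weights and the 0 to 100 scale are purely hypothetical choices for illustration. The source does not specify which indicators or weighting the real Digital Maturity Indicator uses.

```python
from dataclasses import dataclass


@dataclass
class DigitalFootprint:
    """Hypothetical signals extracted from a company's domains and web pages."""
    has_ecommerce: bool
    has_linkedin_link: bool
    has_online_payment: bool


# Hypothetical integer weights that sum to 100, so the score lies between 0 and 100.
WEIGHTS = {
    "has_ecommerce": 50,
    "has_linkedin_link": 20,
    "has_online_payment": 30,
}


def digital_maturity(footprint: DigitalFootprint) -> int:
    """Add up the weights of the signals that are present in the footprint."""
    return sum(
        weight for signal, weight in WEIGHTS.items() if getattr(footprint, signal)
    )


# Scoring one (hypothetical) company at two points in time shows its evolution.
snapshots = {
    2022: DigitalFootprint(has_ecommerce=False, has_linkedin_link=True, has_online_payment=False),
    2023: DigitalFootprint(has_ecommerce=True, has_linkedin_link=True, has_online_payment=False),
}
evolution = {year: digital_maturity(fp) for year, fp in snapshots.items()}
```

Integer weights keep the arithmetic exact. A production indicator would more likely be calibrated against real data than set by hand.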

Reconstruction value of a house

A client in the banking sector needed a data point to estimate, as reliably as possible, the value of a property when a mortgage is foreclosed.

So we created the reconstruction value of a property. The value is obtained by multiplying the square metres of constructed area by the average reconstruction value per square metre of a house with the same characteristics. This means that it depends not only on the square metres, but also on the type of property, the predominant building materials and the geographical area.

Another client in the insurance sector already uses it to improve the calculation of household premiums.
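The reconstruction value calculation described above can be sketched in a few lines of Python. The lookup table, its keys and all the cost figures are invented for illustration only. They are not real market values, and the actual model may group properties differently.

```python
# Hypothetical average reconstruction cost per square metre (EUR/m2), keyed by
# (property type, predominant building material, geographical area).
AVERAGE_COST_PER_M2 = {
    ("flat", "brick", "Madrid"): 1200.0,
    ("detached house", "stone", "Galicia"): 1500.0,
}


def reconstruction_value(
    built_area_m2: float, property_type: str, material: str, area: str
) -> float:
    """Multiply the constructed area by the average cost for comparable homes."""
    cost_per_m2 = AVERAGE_COST_PER_M2[(property_type, material, area)]
    return built_area_m2 * cost_per_m2


# A 90 m2 brick flat in Madrid, using the illustrative figure above.
value = reconstruction_value(90, "flat", "brick", "Madrid")
```

With these illustrative figures, `value` is 90 × 1200.0 = 108000.0. A combination of characteristics that is missing from the table raises a `KeyError`, which makes gaps in coverage visible instead of returning a silent default.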

Pyramid: add value to your information with unique modelled data that does not exist in any other data source

Create new insights by relating your data to our data source, Pyramid. Pyramid gives you several integrated information sources in a single interface, from exclusive Deyde DataCentric data that does not exist in any other information source to preprocessed and validated open data that is ready to consume. We hold business data such as turnover or number of employees, more than 3,200 consumer data and environmental indicators associated with a geographical point, and real estate data from Spain and Portugal.

A flexible solution that adapts to a multitude of problems, contains the largest compendium of data, is 100% GDPR compliant and can be consumed in real time through web services.

Are you still missing data? Use the internet as a data source

We use the internet as a data source and extract the information you need, tailored to your projects, with processes similar to those used by search engines to scan and index web pages.

Frequently asked questions about Data Factory datasets

We create variables from scratch to explain a reality of your business, so that you can add that information to your data models and relate it to them. We have created variables such as the degree of digital maturity…

Create customised databases for your business through multiple sources such as open data, internet and your own data.

Contact

Contact us if you need any information or advice regarding our Deyde DataCentric services.