Open Data for the Government of Aragon (2017)

General project data

  • Description: ORDER IIU/776/2017, of May 25, by which the Instituto Tecnológico de Aragón is entrusted with the activities in 2017 related to the opening of the data of the Government of Aragon.
  • Bulletin No.: 111.
  • Issuing body: Department of Innovation, Research and University.
  • Date of award: 25/05/2017.
  • Date of publication: 13/06/2017.
  • Execution dates: 13/06/2017 to 31/12/2017.

Presentation and objectives

With the objectives of creating economic value in the ICT sector through the reuse of public information, increasing transparency in the Administration, promoting innovation, improving information systems and generating data interoperability between public sector websites, the Department of Innovation, Research and University is assigned the competences of preparing and managing projects and programs for the design and coordination of data openness in the Government of Aragon and its implementation in collaboration with the different Departments and agencies of the Administration, as well as the dissemination of such data through the open data portal of the Government of Aragon.

Aragón Open Data initiated the project to open public data by Agreement of the Government of Aragon of July 17, 2012, and the portal was presented on February 6, 2013. Throughout this time, numerous works have been carried out that have made new data and information available to third parties (citizens, companies, etc.).

As of today, the complexity of the regional public administration in generating data and information is reflected in the proliferation of a large number of websites, subdomains and portals, whether or not they fall under this domain, circumstances that hinder access to and use of the information by users and by the services of the Government of Aragon themselves.

For this reason, given the number of current websites, domains and portals of the Government of Aragon, and by virtue of the competences to improve the information systems of the Administration, to generate data interoperability between public sector websites and to adopt technical standards in the information society, in particular those related to interoperability, it is considered necessary that all institutional information related to the regional administration that exists on the web can be compiled and offered from a single point, regardless of the domain, structure, or capabilities of the different current portals.

Based on this approach, and from the competence of data opening in the Government of Aragon, the need arises within the Directorate General for Electronic Administration and Information Society to retrieve all the institutional information offered on the web so that it can be exploited, analyzed and reused by third parties (other institutional websites, media, developers, or citizens) in a structured and controlled manner, with Aragón Open Data being the access point for this purpose.

It is also intended that the information obtained can be used to verify and, if necessary, enrich, by means of real cases, the practical operation of the Interoperable Information Scheme of Aragon (EI2A). The EI2A was one of the results of the entrustment to the Instituto Tecnológico de Aragón of the 2016 activities regarding the data openness project of the Government of Aragon, formalized by Order IIU/461/2016, of May 9, 2016.

This approach would make it possible to apply the EI2A to real processes and data. The data to be related to the Scheme come from applying web crawling or spidering techniques to the existing domains of the Government of Aragon on the web, techniques that track, capture and store the information and data on the different institutional web pages and portals of the regional administration. This tracking and capture process involves supplying the appropriate parameters to web crawler tools so that, in accordance with specific rules, they index web pages and recognize their content, capturing that content and information about it, for example its title, texts, and location on the web (URL).
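The tracking and capture process described above can be illustrated with a minimal sketch. It is not the crawler used in the project: the names PageParser and crawl, the parameters allowed_host and max_pages, and the idea of passing the download step as a fetch function are all assumptions made for illustration. It uses only the Python standard library and records the title, visible text and URL of each page reached.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class PageParser(HTMLParser):
    """Collects the title, visible text and outgoing links of one HTML page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.text_parts = []
        self.links = []
        self._in_title = False
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._in_title:
            self.title += data.strip()
        elif not self._skip_depth and data.strip():
            self.text_parts.append(data.strip())


def crawl(start_url, fetch, allowed_host, max_pages=50):
    """Breadth-first crawl restricted to one host (hypothetical helper).

    `fetch` is any callable mapping a URL to an HTML string, or None on failure.
    """
    seen = {start_url}
    queue = deque([start_url])
    records = []
    while queue and len(records) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        parser = PageParser()
        parser.feed(html)
        records.append(
            {"url": url, "title": parser.title, "text": " ".join(parser.text_parts)}
        )
        for href in parser.links:
            target = urljoin(url, href).split("#")[0]  # drop fragments
            if urlparse(target).netloc == allowed_host and target not in seen:
                seen.add(target)
                queue.append(target)
    return records
```

Because fetch is a parameter, the sketch can be tried against an in-memory dictionary of pages (for example crawl("http://example.org/", pages.get, "example.org")) before any real download function is plugged in. A real deployment would also need to respect robots.txt and the applicable legal framework, which this sketch does not handle.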

All this captured information and data will be stored for later exploitation according to the EI2A, thus converting the institutional information, data and content found on the web in a dispersed, non-homogeneous, uncontrolled and non-exploitable form into structured data, analyzable as a whole and served under Aragón Open Data, at the disposal of third parties, applications, services and citizens.
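As a rough illustration of turning a captured page into structured data, the sketch below serializes one record (URL, title, text) as N-Triples lines. The EI2A vocabulary is not reproduced here; the Dublin Core terms title and description are used purely as a stand-in, and the function names and the record layout are assumptions made for this example.

```python
DCTERMS = "http://purl.org/dc/terms/"  # stand-in vocabulary, not the EI2A itself


def escape_literal(value):
    """Escape a string for use inside an N-Triples literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def page_to_ntriples(record):
    """Turn a captured page (dict with url, title, text) into N-Triples lines."""
    subject = f"<{record['url']}>"
    lines = []
    for prop, key in (("title", "title"), ("description", "text")):
        value = record.get(key)
        if value:  # skip empty or missing fields
            lines.append(f'{subject} <{DCTERMS}{prop}> "{escape_literal(value)}" .')
    return lines
```

Each captured page then becomes a small set of statements about its URL, which can be loaded into a triple store or mapped later onto whatever properties the target semantic model defines.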

These tasks require identifying, studying and analyzing current trends, the necessary technological development and the processes to be executed on the portals and web domains of the Government of Aragon, all of it protected by a legal framework that allows the use of web crawling techniques, as well as substantial knowledge of interoperability and semantic ontologies. These tasks, which partly follow the line of development of the EI2A and complement the activities related to the data openness project of the Government of Aragon, have been entrusted to the Instituto Tecnológico de Aragón (ITAINNOVA).

Entities entrusted with the performance of tasks

Project results

The actions of ITAINNOVA in this project focus on:

  • Coordination, management, planning and direction of the entrusted work throughout its development, to ensure that results are executed and delivered within the predefined time frame and budget, to ensure the quality of the work and documentation delivered, and to coordinate cooperation among team members.
  • Elaboration of a methodology to be followed for the extraction of information from the designated websites in accordance with the technical-legal framework applicable to the use of web crawling techniques, monitoring of the information extraction actions, and preparation of a report that considers the possibility of publishing the information obtained as data and the conditions under which it may be reused.
  • Preparation of a technology watch report on web crawling technologies, software and services that allow the retrieval of institutional information offered on websites, domains and portals of the Government of Aragon.
  • Elaboration of documents that gather the requirements and the design of the architecture of the system to be developed, with the objective of capturing information provided by websites, subdomains and portals of the Government of Aragon and structuring it in accordance with the semantic model of the Interoperable Information Scheme of Aragon (EI2A), extended to facilitate the standardization of the information and its access and reuse.
  • Development of the semantic solution or system according to the technology watch study, requirements and architecture design. The semantic solution takes care of the commissioning of web crawling software and services, of the processing of the unstructured textual information captured, of the categorization of websites, of the application of data mining and text processing techniques for the extraction of concepts, and of the storage of the processed information using Big Data technologies/databases that allow large amounts of information to be processed in a dynamic and scalable way.
  • Adaptation and improvement of the semantic model Interoperable Information Scheme of Aragon (EI2A) to structure homogeneously the basic data collected from the previously selected websites, subdomains and portals of the Government of Aragon and to define relationships between them, with the purpose of standardizing the information and automating its access and reuse.
  • Development of the Technical Catalog of Standards used in the EI2A to ensure that their use improves the ability of the Government of Aragon to cooperate with other administrations and with the public, facilitating the exercise of the right of access to public information and contributing to the socioeconomic development of Aragon.
  • Analysis of the information extracted through the selected software and web crawling services.
  • Analysis of how to integrate and publish data through the Aragón Open Data API.
  • Testing to verify and validate that all defined and developed functionalities meet the needs of the Government of Aragon.
  • Deployment of the system and Big Data infrastructure (web crawling software and/or services, NoSQL databases, Moriarty framework, Big Data cluster for processing and storing information using Spark technology, etc.) on ITAINNOVA servers.
  • Transfer of the system through the preparation of a data plan report specifying how the collected data is to be transferred and managed, as well as the requirements of the production machines.
  • Dissemination of the system through two dissemination days within the Government of Aragon to publicize the work carried out.
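The text processing steps named in the list above (categorization of websites and extraction of concepts from the captured text) can be sketched in a very simplified form. This is only an illustration based on term frequency and keyword matching, not the data mining techniques actually applied in the project; the stopword list, the function names and the category keywords are hypothetical.

```python
import re
from collections import Counter

# Tiny illustrative stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "of", "and", "to", "in", "a", "for", "on", "is", "by"}
TOKEN = re.compile(r"[a-záéíóúñü]+")


def extract_concepts(text, top_n=5):
    """Return the most frequent non-stopword terms as candidate concepts."""
    tokens = TOKEN.findall(text.lower())
    counts = Counter(t for t in tokens if t not in STOPWORDS and len(t) > 2)
    return [term for term, _ in counts.most_common(top_n)]


def categorize(text, categories):
    """Pick the category whose keywords occur most often; None if none match."""
    tokens = Counter(TOKEN.findall(text.lower()))
    scores = {
        name: sum(tokens[k] for k in keywords)
        for name, keywords in categories.items()
    }
    if not scores:
        return None
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None
```

For instance, categorize(page_text, {"health": ["health", "hospitals"], "education": ["schools", "university"]}) assigns a page to whichever keyword set it mentions most. At scale, the same counting logic would typically run on a distributed engine rather than in a single process.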

Throughout this time, a great deal of work has been carried out to automate the publication of information so that third parties can reuse it in the best way. Given the volume of data that is beginning to accumulate, within the line of work on automation in information management, all those elements that help improve the quality of the structuring of information and the standardization of the data contained in the databases are becoming especially relevant.


  • Budget of ITAINNOVA: 81,757.00 €.

For the performance of the tasks entrusted, the Department of Innovation, Research and University will allocate to ITAINNOVA the amount of 81,757 € (eighty-one thousand seven hundred and fifty-seven euros), which will be charged to the following budgetary applications:

  • 17040 G/5424/609000/91001 (WBS 2012/000354) in the amount of 40,878.50 €.
  • 17040 G/5424/609000/14201 (WBS 2012/000354) in the amount of 40,878.50 €.

of the Expenditure Budget of the Autonomous Community of Aragón for the 2017 fiscal year.

This action is eligible for funding under the ERDF Operational Program 2014-2020, within priority axis 2, "Improving the use and quality of, and access to, ICT".
