DATA ENRICHMENT AND METHODS OF ANALYSIS

 

The Data Enrichment and Analysis Methods (DEAM) project aims to enrich and exploit the data produced by the SUE Information Ontological Infrastructure (IOI-SUE) and the SUE Semantic Data Architecture (ASD-SUE). The IOI-SUE and the ASD-SUE both facilitate the management and publication of semantic data, and the two projects are closely related.

This subproject focuses on using such data and ensuring the sustainability of the infrastructure as follows:

  • Lot 1, Data Enrichment, is intended to facilitate data enrichment using information sources that are available on the Internet and commonly used by the research community. It is also expected to develop mechanisms that let researchers actively participate in validating and updating their information. The enriched information will be managed through the concept of the Research Object (RO), so that researchers can manage their ROs easily without a specific infrastructure having to be built for each type of RO.
  • Lot 2, Analysis Methods, aims to enable the exploitation and analysis of data for inclusion and participation purposes at the various levels of the agents involved. Examples include citizen science; dissemination for non-experts; applications that facilitate the formation of networks of experts and projects; applications for searching for and identifying experts; and applications that let universities and society in general learn about research capacity on specific topics, in terms of both human resources and scientific-technical production.

In addition, the connection APIs to be developed will enable communication between the software components of the Research Management System and other functionalities of the HERCULES project. These APIs will therefore act as an interface that allows the University's Research Management applications to communicate with the available databases.

 

Enrichment refers to the ability of HERCULES to make use of data served by applications external to HERCULES. This enrichment will be fully configurable and ideally oriented to consuming data from whichever API(s) are considered useful for adding value to the HERCULES data cloud.

Central Services (HERCULES Core Services)
This module will provide a series of services that are described below, and that will be relevant for other modules to be developed in Lot 1 and in Lot 2. 

Single Sign-On Module (SSO) 
HERCULES must provide programmatic methods that allow registered users to access multiple HERCULES applications with a single username and password. This module should not be confused with the Single Login Entry Point, described below.  

Single Login Entry Point (SLEP) module 
HERCULES must provide the user with a single registration through which it is possible to access multiple web applications using only the HERCULES identification system.

For example, once a HERCULES user is authenticated, the user authorizes the HERCULES platform, through the web interface, to access their accounts on GitHub, FigShare, SlideShare, LinkedIn, SHARE, etc. HERCULES will save and manage this authorization so that the user does not have to repeat the authentication and authorization process each time they enter HERCULES. Where applications have security restrictions that depend on the type of data, these must also be managed through the HERCULES interface.

HERCULES will thus create an identity record for the user. This way, HERCULES will know that user X is YYYY on ORCID and KKKK on GitHub. In addition, HERCULES will hold the corresponding authorizations, so that applications using the HERCULES SSO system inherit this ability to manage multiple identities and thus access multiple applications.
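As an illustration only, the identity record described above could be modeled as follows. The class name `IdentityRecord`, its fields and the platform keys are hypothetical and are not part of the HERCULES specification; a real implementation would also store the OAuth tokens securely.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class IdentityRecord:
    """Links one HERCULES user to accounts on external platforms (hypothetical model)."""
    hercules_user: str
    # platform name -> identifier on that platform, e.g. {"orcid": "YYYY"}
    external_ids: Dict[str, str] = field(default_factory=dict)
    # platforms the user has authorized HERCULES to access on their behalf
    authorized: Set[str] = field(default_factory=set)

    def link(self, platform: str, external_id: str, authorized: bool = True) -> None:
        """Record the user's identifier on a platform and, optionally, the authorization."""
        self.external_ids[platform] = external_id
        if authorized:
            self.authorized.add(platform)

    def id_on(self, platform: str) -> Optional[str]:
        """Return the user's identifier on a platform, or None if it is not linked."""
        return self.external_ids.get(platform)


# "User X is YYYY on ORCID and KKKK on GitHub"
record = IdentityRecord("user-x")
record.link("orcid", "YYYY")
record.link("github", "KKKK")
print(record.id_on("orcid"), record.id_on("github"))
```

Keeping the identifiers and the authorizations in separate fields lets HERCULES know an identity (for example, a public ORCID) even where the user has not yet granted access.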

The ability to authenticate and authorize HERCULES on multiple platforms should be extensible. The successful bidder must start with GitHub, FigShare, SlideShare, LinkedIn, SHARE, data.world, Dataverse and ORCID, in addition to the authentication services of the national universities that are members of CRUE. The bidder should also ensure that this component is extensible so that more applications can be added as necessary.

User administration module 
A module must be developed for HERCULES that allows users to be created, edited, deleted and suspended. This module should also define the roles and permissions for the different types of HERCULES users. The definition of user profiles must be coordinated with the SSO module. This module will also manage the security associated with each user profile.

Management of FAIR Research Objects (HERCULES FAIR RO) 

RO Management Module
Researchers keep their Research Objects (Objetos de Investigación, hereinafter RO) in various repositories, for example FigShare, SlideShare, Git, SHARE, SGI HERCULES, etc. This means that, on top of the diversity of ROs, there is the added complexity of managing them across various applications. Because each application specializes in handling certain types of RO (for example, GitHub for code), this situation will not change. HERCULES is therefore required, ideally, to let researchers declare and catalog their ROs regardless of the application in which they reside. To this end, HERCULES will enable the researcher, via the APIs and OAuth methods available, to “claim” their ROs in each application.

Once the user has claimed their ROs, HERCULES will register them in the user's name, saving all associated metadata and establishing the traceability of their provenance. Furthermore, a metadata model should be investigated and developed that allows the user to describe the RO in compliance with the FAIR principles. The management of ROs also involves establishing relationships between ROs that come from different applications. For example, the user should be able to relate a preprint in SHARE to a dataset in FigShare and to code in GitHub. The resulting RO will then be an aggregation of the individual ROs. The composition and enrichment of the RO is expected to take place in a graphical interface that allows direct manipulation of the ROs. Examples of this type of functionality can be seen at https://rmap-hub.org.
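The aggregation described above can be pictured with a small data structure. This is a minimal sketch under stated assumptions: the class names, the identifiers and the relation names (`isSupportedBy`, `isAnalysedWith`) are invented for illustration and do not come from any HERCULES metadata model.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class ResearchObject:
    ro_id: str    # identifier assigned by HERCULES (hypothetical)
    source: str   # application the RO was claimed from
    kind: str     # e.g. "preprint", "dataset", "code"


@dataclass
class AggregatedRO:
    title: str
    members: List[ResearchObject] = field(default_factory=list)
    # (subject id, relation name, object id) triples between members
    relations: List[Tuple[str, str, str]] = field(default_factory=list)

    def add(self, ro: ResearchObject) -> None:
        """Add an individual RO to the aggregation, ignoring duplicates."""
        if ro not in self.members:
            self.members.append(ro)

    def relate(self, subject: ResearchObject, relation: str, obj: ResearchObject) -> None:
        """Relate two ROs; both must already belong to the aggregation."""
        for ro in (subject, obj):
            if ro not in self.members:
                raise ValueError(f"{ro.ro_id} is not part of this aggregation")
        self.relations.append((subject.ro_id, relation, obj.ro_id))

    def sources(self) -> set:
        """Applications that contribute at least one member."""
        return {ro.source for ro in self.members}


preprint = ResearchObject("ro-1", "SHARE", "preprint")
dataset = ResearchObject("ro-2", "FigShare", "dataset")
code = ResearchObject("ro-3", "GitHub", "code")

study = AggregatedRO("Example study")
for ro in (preprint, dataset, code):
    study.add(ro)
study.relate(preprint, "isSupportedBy", dataset)
study.relate(preprint, "isAnalysedWith", code)
print(sorted(study.sources()))
```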

This module should also suggest to the researcher ROs from authorized sites that are related to the researcher but have not yet been included. For example, if a researcher authorizes HERCULES on FigShare, “claims” only one RO there and fails to claim the other ROs in their name, they will receive suggestions indicating that other ROs in their name on FigShare are still pending to be claimed.
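At its simplest, the suggestion of unclaimed ROs is a set difference between what the external application lists under the researcher's name and what has already been claimed in HERCULES. The sketch below assumes both lists have already been retrieved; the function name and the identifiers are illustrative only.

```python
def suggest_unclaimed(found_in_application, claimed_in_hercules):
    """Return IDs listed by the application for this researcher but not yet claimed."""
    claimed = set(claimed_in_hercules)
    # Preserve the order in which the application listed the items.
    return [ro_id for ro_id in found_in_application if ro_id not in claimed]


figshare_items = ["fig-1", "fig-2", "fig-3"]  # hypothetical identifiers
already_claimed = ["fig-2"]
pending = suggest_unclaimed(figshare_items, already_claimed)
print(pending)
```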

Similarly, this module is expected to suggest ROs that belong to other HERCULES researchers who have completed the authorization process. Once authorized, the module will know which ROs are held in the researcher's name in the authorized application. This will allow HERCULES to profile those ROs by assigning research topics to them. For example, for a poster on FigShare, the module will be able to extract the topics from the poster and thus determine which HERCULES users might be interested in it. Once this has been calculated, it will make the corresponding suggestion. A similar process is expected to profile the ROs on the other sites for which authorization has been granted. For example, a repository on GitHub can be profiled and research topics assigned to that profile, such as Python as a programming language. The recommendation may then be addressed to a HERCULES user who is also a Python programmer.

All automatic topic extraction, and topics associated with an RO, must also be available via an API. This API should facilitate the consumption of this data for the purpose of, for example, adding the topics.  

Below, in a non-exhaustive way, some of the possible repositories that could be considered for this purpose are listed:

Research Objects Integrated Management Module 

An RO can be an aggregation of several ROs, and ROs are also the basic elements of such an aggregation; an RO may therefore consist of a single element. ROs are of very different types, and the handling of each specific type in terms of registration, enrichment (manual and automatic), export, import, establishment of automatic relationships, etc. is not entirely generic: each type of RO has its own nuances.

Some specific types of RO of interest to HERCULES, seen as an Open Science platform, are described in the next sections: code, bibliographic references, manual annotations and experimental protocols. Innovative proposals are expected for the integrated management of a varied range of ROs. It is important to note that an RO can have value on its own; aggregation adds value to the whole and also creates value for those who have participated in its definition and creation.

Code module

One specific type of RO is code, together with all its associated processes, for example versioning, documentation, unit tests, etc. This type of RO carries the complexity inherent in code itself. Proponents are expected to provide practical solutions that allow GitHub and BitBucket repositories to be extracted and annotated. Once a repository has been registered in the name of the HERCULES researcher related to it, the module is expected to disambiguate, or allow the user to disambiguate, the type of relationship the researcher has with the repository. For example, is it a repository that the researcher created de novo, or is it a fork of another repository? If it is a fork or clone of another repository, does it contain work by the researcher that differentiates it from the original?
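The disambiguation questions above can be expressed as a small decision rule. The sketch assumes two inputs have already been obtained from the hosting platform: whether the repository is a fork, and how many commits it is ahead of its parent. Both parameter names and the three labels are hypothetical, and counting commits is only one possible proxy for "own work".

```python
def classify_repository(is_fork: bool, commits_ahead_of_parent: int = 0) -> str:
    """Classify the researcher's relationship with a repository (illustrative rule)."""
    if not is_fork:
        # Created de novo by the researcher.
        return "original"
    if commits_ahead_of_parent > 0:
        # A fork that contains work not present in the parent repository.
        return "fork with own work"
    return "unmodified fork"


print(classify_repository(False))
print(classify_repository(True, commits_ahead_of_parent=12))
print(classify_repository(True))
```

Where the automatic rule is inconclusive, the module could fall back on asking the user, as the text allows.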

The code generated by a researcher is part of their assets, and as such it should be possible to quantify it within their academic production as a whole. This module should allow such quantification (contribution, original work, derivative work, etc.) for code held in Git-like repositories. The aim is to give value to the code generated by a researcher; to that end, a metadata model must be investigated and developed that allows this code to be described and quantified as part of a researcher's academic production. Documentation, usage examples, metadata suggested by the author and topics automatically assigned by HERCULES are just some of the aspects that should help characterize and quantify this type of RO.

Bibliographic references module 

Bibliographic references are a set of specific annotations on the text that point to an external document. Viewed as ROs, their representation is expected to be the simplest, and various types of relationships with the text can be established from them depending on the type of citation, for example a URL, a published document or a personal communication. The bibliographic references module must resolve the references in the papers published by HERCULES researchers, capture their metadata, enrich the metadata of the referenced publications with topics, and build a conceptual map for the researcher showing their research career, whom they cite, and the relationship between the topics of their papers and those of the papers they cite. This module is expected to make extensive use of resources such as OpenCitations, CiteSeerX, CrossRef, Europe PMC, etc.

Manual annotations module 

There are several types of annotations, depending on the purpose. For example:

  1. With the purpose of generating a reference between a specific element within the paper or document to a file external to the document, for example, a figure or code.  
  2. With the purpose of expanding or explaining a narrative element within the text, with no expectation of generating a discussion about the annotation. These are annotations made by the authors after publication, for example those pointing to more recent literature related to a part of the text.  
  3. With the purpose of generating a discussion about a narrative element within the document, in which others are expected to be able to participate. Annotations of this type can, for example, refer to parts of the text that generate controversy, or be annotations in which any reader questions a specific part of the text.  

Proposals are expected that allow the annotation of a document and cover the cases presented above. Annotations are also expected to be treated as ROs; they must therefore follow the corresponding standards, for example those of an RO and those proposed by the Open Annotation Framework. The annotation system as a whole must have its own API and a document export and ingestion system, and must be coordinated with the automatic annotator (topic identifier) described in this document. Proponents are expected to reuse existing systems such as Dokieli or Hypothes.is.

Experimental Protocol Registration Module

Experimental protocols describe the workflow to follow when conducting an experiment. These workflows are of very different types, the best known being the experimental protocols of the life sciences. However, experimental protocols are not used only in biology and medicine: every field of work has protocols of some kind. This module is not intended to standardize the way research workflows are reported. Instead, it simply aims to make any type of experimental protocol a FAIR first-order RO.

To achieve this, it is necessary to standardize the metadata for experimental protocols in certain domains: what do they have in common, and what is their lowest common denominator? With these metadata, protocols are registered as ROs, which later allows them to be managed as ROs in HERCULES, that is, related to other ROs. For example, the Sample, Instrument, Reagent, Objective model (SIRO: Giraldo 2017, Giraldo 2018) covers a wide variety of protocols in the biological domain. In the same way, there are other minimum information standards that can be used to support the registration of other types of experimental protocols. Proposals are expected to determine how a wide variety of experimental protocols can be registered in order to FAIRify them and to make explicit the relationship between the experimental protocol, its results and the related publications.

This repository must be easy to use, facilitate access to data via API, and also be properly documented. It is also expected to be in communication with the topic extraction module and all the machinery for RO management. 

RO recovery module

ROs represent those research products that are created, managed, distributed, shared and modified in the course of an investigation. As such, ROs are part of the results or outcomes of an investigation. At present the value of the ROs is concentrated in the published document; however, this value does not reflect the production chain established in an investigation, nor does it reflect the contributions made or the value created in the research. For practical purposes, the value of any RO that is not a publication is lost. This inequity means a loss of value for the research system: such ROs are assets on which no value is placed and which are therefore discarded. This is directly detrimental both to researchers, who have an interest in being recognized for their products, and to the research system, which cannot account for this type of value.

Innovative solutions are sought here that allow a value to be established for all the ROs produced in an investigation. The aim is to have a registry that covers ROs in general but is also capable of recognizing the specific nuances of certain ROs, for example research protocols, clinical protocols, code, etc. Proposals are also expected to allow the generation of an aggregation currency that makes it possible, for each RO held by a researcher, to disaggregate it and thus determine its origin: for example, in what research it was produced, how it has been reused, who has reused it, what modifications it has undergone, etc. For this scenario, it would be of interest to assess innovative solutions based on Blockchain or the InterPlanetary File System (IPFS).

Research Objects Processing and Analysis (HERCULES RO Enrichment) 

Topic extraction module 

Researchers registered with HERCULES have a wide variety of publications. These are described with the metadata that each journal provides. However, this description is indirect in that it is not derived from the content itself. To improve the description of a researcher's experience, this module will extract topics directly from their publications. Once extracted, these topics will become part of the topics that describe the researcher's experience. Each topic extracted in this way must have its own metadata, indicating among other things its origin (for example, the DOI of the publication from which it was extracted and the extraction method applied).

The automatic extraction of research topics is not exact. For this reason, the researcher must in all cases be able to correct the experience topics suggested by the algorithm. These corrections must be traceable, and the algorithm is expected to learn from them. Similarly, the HERCULES Dashboard (described later) is expected to implement a functionality that allows other HERCULES researchers to suggest experience topics for their colleagues. An example of the expected generic type of workflow is given below:

  • The user configures the service with which they want to enrich the data. Ideally, the user will have a list of APIs configured by default and only needs to select the one to use for enrichment.
  • The system automatically presents to the user those concepts or properties for which a match can be made between the knowledge graph and the API configured in the previous step. The user will also have the option of selecting the concepts or properties they are interested in enriching. The results will be ordered according to their similarity to the original resource.
  • Once the process has been executed, the results are displayed on the screen. The user decides which of the available proposals to accept, if any, and executes the enrichment.
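The three steps above can be sketched as two small functions: one that ranks the fields offered by the configured API by similarity to a knowledge-graph property, and one that applies only the proposals the user accepted. This is an illustration under assumptions: the function names are invented, string similarity from the standard library's `difflib.SequenceMatcher` stands in for whatever matching method is finally chosen, and the rule of never overwriting existing values is a design choice made for the example.

```python
from difflib import SequenceMatcher


def rank_candidates(graph_property, api_fields):
    """Order external API fields by string similarity to a knowledge-graph property."""
    scored = [
        (SequenceMatcher(None, graph_property.lower(), name.lower()).ratio(), name)
        for name in api_fields
    ]
    # Highest similarity first; ties are broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored]


def enrich(resource, proposals, accepted_fields):
    """Apply only the accepted proposals; existing values are kept (assumption)."""
    enriched = dict(resource)
    for name in accepted_fields:
        if name in proposals and name not in enriched:
            enriched[name] = proposals[name]
    return enriched


ranking = rank_candidates("title", ["doi", "subtitle", "title"])
print(ranking)

resource = {"title": "A"}
proposals = {"doi": "10.0000/example", "title": "B"}  # placeholder values
print(enrich(resource, proposals, accepted_fields=["doi", "title"]))
```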

HERCULES will allow the most common Natural Language Processing tasks to be carried out in order to extract research topics from publications and ROs. To this end, the successful bidder will have to implement or reuse tools that allow the full text of the publications or ROs to be analyzed on the basis of automatic term extraction. Of special interest for HERCULES are the reuse of developments financed by the Plan de Impulso de las Tecnologías del Lenguaje of the Government of Spain and alignment with international initiatives related to the European Open Science Cloud.

The successful tenderer will present a proposal on how it will approach this module, in accordance with the criteria and tools that it considers necessary for its implementation. As a starting point, the successful tenderer will take into account that most methods for automatic term extraction work in two phases, linguistic processing and statistical processing, and will therefore have to specify how it will approach each phase. For linguistic processing, it will have to specify which method it will use to, for example, tag parts of speech, filter out stop words, and so on. For statistical processing, it will have to specify which automatic term extraction method it will use and justify the choice, e.g. C-value / NC-value (Frantzi 1999). The successful tenderer will also have to bear in mind that the module must support the extraction of terms in English and Spanish.
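To make the statistical phase concrete, the following is a minimal sketch of the basic C-value score, assuming the linguistic phase has already produced candidate terms with their frequencies. In this basic form a candidate that is not nested in any longer candidate scores log2(length) × frequency, and a nested candidate has the average frequency of the longer candidates containing it subtracted first. The example terms and counts are invented, the NC-value context weighting is not covered, and single-word candidates score zero under this formula, which some variants adjust.

```python
import math


def is_nested(short, long):
    """True if `short` occurs as a contiguous word sequence inside the longer `long`."""
    n, m = len(short), len(long)
    if n >= m:
        return False
    return any(long[i:i + n] == short for i in range(m - n + 1))


def c_value(frequencies):
    """Compute the basic C-value for each candidate term.

    `frequencies` maps a candidate term (a tuple of words) to its corpus frequency.
    """
    scores = {}
    for term, freq in frequencies.items():
        # Frequencies of the longer candidates that contain this term.
        containers = [f for other, f in frequencies.items() if is_nested(term, other)]
        if containers:
            adjusted = freq - sum(containers) / len(containers)
        else:
            adjusted = freq
        scores[term] = math.log2(len(term)) * adjusted
    return scores


# Invented candidate terms and counts, for illustration only.
candidates = {
    ("basal", "cell", "carcinoma"): 5,
    ("cell", "carcinoma"): 8,
    ("carcinoma",): 12,
}
scores = c_value(candidates)
for term, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(" ".join(term), round(score, 3))
```

Since the module must support English and Spanish, only the linguistic phase (tagging and stop-word filtering) would need to be language-specific; the scoring itself is language-independent.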

Semantic similarity module

Textual semantic similarity aims to determine when the meaning of two texts is similar. This concept differs from finding the degree of textual similarity, relative to measuring the number of lexical components that both texts share. In the context of HERCULES, proponents are expected to come up with a solution that allows the end user to see semantic similarity between texts.

This is how, for example, it is expected that, for a publication, HERCULES indicates those publications that are similar in content, clearly communicating to the user how and why the documents found by HERCULES are similar to one given by the user, that is, explain the similarities.

In the same way, this HERCULES module is expected to be able to compute the similarity between ROs. For example, if the user has an RO of type code, written in Python 3.1, with genomic sequences as input data, and terms such as GWAS (Genome-Wide Association Studies) and Oryza sativa appear in its description, then HERCULES is expected to find, among the ROs it knows, those that are similar and to suggest them to the user as similar ROs.
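One simple way to surface both a score and an explanation for RO similarity is to compare sets of descriptors and report what they share. The sketch below uses plain set overlap (the Jaccard index) over already-normalized descriptors; this is a stand-in chosen for readability, not a semantic similarity model, and the descriptor values are invented.

```python
def compare_ros(descriptors_a, descriptors_b):
    """Return (Jaccard score, sorted shared descriptors) for two ROs."""
    a, b = set(descriptors_a), set(descriptors_b)
    shared = a & b
    union = a | b
    score = len(shared) / len(union) if union else 0.0
    # The shared descriptors double as the explanation shown to the user.
    return score, sorted(shared)


query_ro = {"code", "python", "genomic sequences", "GWAS", "Oryza sativa"}
candidate_ro = {"code", "python", "GWAS", "Zea mays"}

score, explanation = compare_ros(query_ro, candidate_ro)
print(score, explanation)
```

A semantic approach would replace the exact-match set operations with a comparison that also recognizes related terms, while keeping the same "score plus explanation" output shape.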

Investigator Profile (HERCULES Researcher Dashboard)

Researcher Dashboard module 

HERCULES will allow users to manage their profile data. This module will therefore allow the user to authorize the set of HERCULES applications on other applications, in order to indicate where their scientific-technical production is and what it consists of. For example, using APIs and OAuth, the user must be able to authorize HERCULES to list the items in their name on FigShare. This enrichment of the profile with data from the web should generate a record in the HERCULES databases. Once the user's production on different web applications has been captured, these objects will be enriched with descriptions from the controlled vocabularies used by HERCULES. In addition, the user must have visualizations that help them understand their trajectory graphically; these visualizations will be built from the data supplied and from the data captured from third-party platforms. This module will also retrieve a researcher's information from any node of the HERCULES network.

This module must take into account the functionality developed for the management of the researcher's CV in the SGI HERCULES document (file E-CON-2019-53), sharing the modeling and interface aspects that ensure optimal development and interoperability of both. This module will have as an important source of information the data from SGI HERCULES published semantically, but it may show information retrieved from other sources that is not available, or shared by SGI HERCULES. Information updates made by the researcher using the HERCULES Researcher Dashboard will be notified to the SGI of the researcher's institution. This module will also notify the investigator's institution of the new information available, from other sources, not offered by SGI HERCULES. This module will offer the complementary edition view to the Research Web Portal included in the SGI HERCULES specifications, so the successful bidders must collaborate to obtain an integrated result as well as optimized for their types of users.

The user who will interact directly with the module included in this specification will be the Researcher. Among the needs that this module will satisfy are: 

  • Enter and edit my CV data. The model must follow at least the structure of the FECYT CVN. If possible, HERCULES should be able to load data via API directly from external sources and from SGI HERCULES. It will allow the data to be communicated to SGI HERCULES in order to update the researcher's CV at their institution. Expected data items include, but are not limited to:
  1. Personal data (name, surname, NIF / NIE, email, etc.)
  2. Researcher identification data (ORCID, Google Scholar, etc.)
  3. Academic training (degree, master's, doctorate, etc.), indicating the details of the degree obtained, the date, the training center, etc.
  4. Complementary training, for example in foreign languages. For each training, the date, the name of the certificate obtained, etc. will be included.
  5. Knowledge area / discipline.
  6. Work history, indicating the start / end date, a description of the activities carried out, the name of the position, the modality, the employing entity, etc.
  7. Projects in which I have participated, including the roles I have played.
  8. Scientific production that I have generated, for example articles, software, datasets, etc. For each scientific result, the title, publication date, DOI, associated reference, etc. will be included.
  9. Patents, industrial designs, etc. that I have registered either as owner or as joint owner.
  10. Awards received, recording the name and the year awarded. For example, one of the Premios Nacionales de Investigación.
  11. Participation in scientific dissemination events, for example congresses, workshops, science night, etc., including my role in them.
  12. Etc.
  • Obtain the list of the works that I have directed / co-directed, whether undergraduate final projects (FDP), master's theses or doctoral theses.
  • Obtain the list of congresses / workshops and scientific dissemination events in which I have participated, indicating the role that I have had: organizer, exhibitor, etc.
  • Obtain the list of patents, industrial designs, etc. registered by a given person (X or Y) or institution (Z or K) as owner or co-owner.
  • Obtain the list of projects in which I have participated, including the role that I have played, for example principal investigator.
  • Obtain the list of my scientific production.
  • Obtain the list of startups or spin-offs that I have founded or of which I have been a partner.
  • Obtain the indicators of my scientific production, such as total citations, h-index, etc.
  • Visualize my trajectory along a timeline that can be parameterized by criteria such as, for example, projects, supervised / co-supervised theses, etc.
  • Know whether I am eligible to request an evaluation related to the new six-year knowledge transfer and innovation period or one of the evaluations carried out by ANECA (NAQAA, National Agency for Quality Assessment and Accreditation).
  • Introduce technological offers aimed at companies, for which I will have to describe the offer, associate a maturity level (TRL, Technology Readiness Level) and attach evidence that supports the assigned maturity level.
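Two of the indicators named in the list above, total citations and h-index, can be computed directly from per-publication citation counts. The h-index is the largest h such that the researcher has h publications with at least h citations each. The citation counts below are invented for illustration.

```python
def h_index(citations):
    """Largest h such that h publications have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for position, count in enumerate(ranked, start=1):
        if count >= position:
            h = position
        else:
            break
    return h


citations_per_publication = [10, 8, 5, 4, 3]  # invented example data
print("total citations:", sum(citations_per_publication))
print("h-index:", h_index(citations_per_publication))
```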

This lot includes a series of applications that will enable the exploitation and analysis of the data held in HERCULES.

Researcher search - Research Synergy Finder module

HERCULES will allow a user to search for researchers and their scientific production in order to help detect potential collaboration niches and strategic alliances. A common practice in research is to form working groups in order to assemble consortia that apply for research funding. In many cases these associations are formed subjectively: researchers team up with those they know, not necessarily with those best suited to form a group to tackle a problem or apply for a grant. HERCULES must make it easier for users to detect synergies by finding researchers who meet certain characteristics in order, for example, to be more competitive when applying for funding. The use cases to be solved by this module include recommending potential partners to form a consortium to bid for a project, identifying experts able to evaluate project proposals, etc. In any case, this module will provide an interface in which the user can filter, by knowledge area/discipline, the research centers/structures of Spanish universities that are carrying out research activities in an area/discipline of interest.

The functionalities that this module is expected to cover, as a minimum, are the following:

  • As a user, I need to obtain a list of the research centers/structures working in a specific area/discipline.
  • As a user, I need to obtain a list of the researchers of a research center/structure in a specific area/discipline. This list can be filtered by type of researcher, whether teaching staff, research staff in training, etc.
  • As a user, I need to obtain the Top 10 (or whatever number is considered relevant, since it will be parameterizable) researchers of a research center/structure, ordered by number of citations, number of publications, h-index, etc. in a specific area/discipline.
  • As a user, I need to obtain the Top 10 (or whatever number is considered relevant, since it will be parameterizable) research centers/structures that hold associated quality seals, for example the Severo Ochoa seal.
  • As a user, I need to obtain a list of the research centers/structures that have carried out H2020 projects and/or Plan Estatal projects.
  • As a user, I need to obtain a list of the scientific production of a research center/structure in an area/discipline within a given date range. For each result, some important metadata of the production will be included, for example DOI, year of publication, etc.
  • As a user, I need to obtain a visualization showing the distribution of Spanish scientific production, for example articles published in journals, by autonomous community over a range of years.
  • As a user, I need to compare autonomous communities, universities, research groups, etc. on given topics in order to identify which is the most competitive and why.
  • As a user, I need to obtain a list of patents, industrial designs, etc. of a research center/structure in an area/discipline.
  • As a researcher or as non-research university staff, I need to obtain a list of the projects awarded/carried out by a research center/structure, in an area/discipline, in a given search year, with access to at least the following details:
    • Project name
    • Keywords
    • Type of participation: coordinator or participant
    • Type of project: competitive or non-competitive
    • Type of funding: public or private
    • Type of call: national, H2020, etc.
    • Number and list of people involved in the project
    • Name(s) of the principal investigator(s)
    • Deliverables/project report
    • Scientific production related to the project
    • Collaborating/participating entities
    • Amount
    • etc.
  • As a non-research academic user, I need to know the size, experience and aging of a research area at university, regional and national scale.
  • As a user, I need to know the percentage of participation of a research center/structure in national or European projects.

Desde el punto de vista de detección de sinergias, se desea responder a la necesidad de obtener la recomendación de un clúster de colaboración conformado por investigadores relacionados según la similitud entre artículos científicos de un área/disciplina de interés. El número de investigadores que conforman el clúster será parametrizable. En caso de que exista una carencia de investigadores para el área disciplina de interés, el módulo será capaz de sugerir investigadores de un área similar para poder al menos contar con aquellos profesionales que estén relacionados con la temática de interés. 

This module will allow the user to specify the area(s)/discipline(s) of interest and the number of experts (N) to be found. After this basic configuration, the user will have the option of configuring additional selection criteria. Some relevant criteria that the contractor will take into account when scoring each researcher are listed below; the list is not exhaustive:

  • Number of national, European, etc. projects in which the researcher has participated, and the role held.
  • Number of citations, h-index, etc.
  • Number of patents.
  • Etc.

In addition, this module will allow the weight of each of these criteria in the total score to be configured, for example: number of citations 25%, number of European projects in which the researcher has participated 50%, role as principal investigator in European projects 25%, etc.
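
The weighting scheme can be sketched as a weighted sum. The example below uses the 25% / 50% / 25% weights mentioned above; the criterion names, the candidate figures and the choice to scale each criterion by its maximum across candidates are assumptions for illustration, since the scoring formulas are left for the contractor to propose.

```python
def weighted_scores(candidates: dict[str, dict[str, float]],
                    weights: dict[str, float]) -> dict[str, float]:
    """Score each candidate as a weighted sum of criteria scaled to [0, 1]."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    # Assumed normalisation: divide each criterion by its maximum across candidates.
    maxima = {c: max(v.get(c, 0.0) for v in candidates.values()) for c in weights}
    scores = {}
    for name, values in candidates.items():
        scores[name] = sum(
            w * (values.get(c, 0.0) / maxima[c] if maxima[c] else 0.0)
            for c, w in weights.items()
        )
    return scores


# Weights from the example in the text; criterion names are hypothetical.
weights = {"citations": 0.25, "european_projects": 0.50, "pi_european_projects": 0.25}
candidates = {
    "Researcher A": {"citations": 400, "european_projects": 2, "pi_european_projects": 0},
    "Researcher B": {"citations": 100, "european_projects": 4, "pi_european_projects": 1},
}
scores = weighted_scores(candidates, weights)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking, scores)
```

Without some normalisation, a criterion with large raw values (such as citations) would dominate the score regardless of its weight, which is why the sketch scales every criterion first.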

If there are no researchers in the required area/discipline of interest, the inference features of the ontologies developed in Hércules ASIO will be used to suggest researchers from the most similar area. This recommendation will be accompanied by the area/discipline to which the recommended researcher belongs, including the areas/disciplines that had to be traversed within the graph to reach the result. Referring to the ontology example in the Semantic Web section of this document: if Clinical Microbiology is a subarea of Microbiology and we search the system for an expert in Clinical Microbiology but none is available, the system could recommend an expert in Microbiology before one in Genetics, since those areas are semantically closer. The path detail attached to the recommended researcher would therefore be Clinical Microbiology - Microbiology. The position of the researcher among the N recommended researchers, according to the score obtained, will also be shown. The contractor is expected to propose further criteria for making the recommendations (which may even be drawn from the indicators module) and the formulas for calculating each researcher's score.
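
A minimal sketch of this fallback follows, assuming the area hierarchy is available as a simple child-to-parent mapping. The dictionaries, the "Biology" parent area and the function name `find_nearest_area` are hypothetical; the real module would rely on the inference features of the Hércules ASIO ontologies rather than a hand-written dictionary.

```python
def find_nearest_area(area: str,
                      broader: dict[str, str],
                      experts: dict[str, list[str]]) -> tuple[list[str], list[str]]:
    """Walk up the area hierarchy until an area with experts is found.

    Returns (path traversed, experts found); the expert list is empty
    when no ancestor area has any expert.
    """
    path = [area]
    seen = {area}
    while not experts.get(path[-1]):
        parent = broader.get(path[-1])
        if parent is None or parent in seen:  # reached the root, or a cycle
            return path, []
        path.append(parent)
        seen.add(parent)
    return path, experts[path[-1]]


# Hypothetical hierarchy and expert lists, for illustration only.
broader = {
    "Clinical Microbiology": "Microbiology",
    "Microbiology": "Biology",
    "Genetics": "Biology",
}
experts = {"Microbiology": ["Researcher M"], "Genetics": ["Researcher G"]}

path, found = find_nearest_area("Clinical Microbiology", broader, experts)
print(" - ".join(path), found)
```

The sketch only moves towards broader areas; a fuller notion of semantic closeness would also consider sibling areas and rank them by graph distance.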

Once the recommendation has finished, the user will automatically have access to the full CV of the researchers suggested by the platform.

Analysis of research projects - Project Management and Analysis Module

HERCULES will allow users to analyze the results of the research projects being carried out in a research center/structure. The objective of this module is therefore to increase transparency and to facilitate the management and monitoring of the projects that the user coordinates or in which a research center/structure participates. The primary source of information will be the data from the SGI HERCULES that has been shared in semantic format. The SGI HERCULES must be able to access the HERCULES network-level analysis services generated by this module.

This module will cover at least the following needs:

  • As a researcher or as non-research university staff, I need to insert/modify the data related to research projects, including the deliverables generated in the proposal phase. The user will have access to this information according to the access level previously granted by role and, where applicable, by confidentiality level. The data provided for each project will include at least:
    • Project name
    • Keywords
    • Type of participation of the entity: coordinator or participant
    • Type of project: competitive or non-competitive
    • Type of funding: public or private
    • Type of call: national, H2020, etc.
    • Number and list of people involved in the project
    • Name(s) of the principal investigator(s)
    • Project deliverables/report
    • Scientific output related to the project
    • Collaborating/participating entities
    • Amount
    • Etc.
  • As a user, I need a visualization that lets me explore the information of each project according to the filters I have chosen, for example by year, by type of call, by amount greater than a given value, by area/discipline, by geographic location, etc.
  • Identify projects with similar topics and scientific objectives. In this case, the user will be able to access comparative visualizations of the information of similar projects.
  • As a user, I need a visualization that lets me analyze the evolution of a researcher, a set of researchers or research lines through the results of the projects carried out. It will be possible to select the projects to include. The user will be able to access comparative visualizations of the information of the selected researchers, sets of researchers or research lines.
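
The filter-based exploration in the list above could be backed by something like the following sketch. The `Project` record is a reduced, hypothetical subset of the project fields listed earlier, and `filter_projects` is an illustrative name, not part of any HERCULES API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Project:
    name: str
    year: int
    call_type: str   # e.g. "national" or "H2020"
    amount: float    # funding amount in euros
    area: str


def filter_projects(projects: list[Project], *,
                    year: Optional[int] = None,
                    call_type: Optional[str] = None,
                    min_amount: Optional[float] = None,
                    area: Optional[str] = None) -> list[Project]:
    """Keep the projects that satisfy every filter that was supplied."""
    selected = []
    for p in projects:
        if year is not None and p.year != year:
            continue
        if call_type is not None and p.call_type != call_type:
            continue
        # "Amount greater than a given value": strictly greater.
        if min_amount is not None and p.amount <= min_amount:
            continue
        if area is not None and p.area != area:
            continue
        selected.append(p)
    return selected


# Hypothetical records, for illustration only.
projects = [
    Project("Project X", 2020, "H2020", 250_000.0, "Microbiology"),
    Project("Project Y", 2020, "national", 80_000.0, "Genetics"),
    Project("Project Z", 2021, "H2020", 120_000.0, "Microbiology"),
]
h2020_large = filter_projects(projects, call_type="H2020", min_amount=200_000)
print([p.name for p in h2020_large])
```

Since the source data is shared in semantic format, the same filters would in practice more likely be expressed as queries against the semantic store than as in-memory loops.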

Analysis of research indicators - Indicator Catalog Module

The user will be provided with research and innovation indicators for measuring national research from the information available in the HERCULES network and in external sources. This module will be able to calculate the values of at least 20 indicators (this number is approximate, as more or fewer indicators could be considered), which must be proposed by the contractor and validated by the contracting entity. The indicators defined as fundamental in the SGI HERCULES tender specifications (file E-CON-2019-53) will be taken into account. It must be possible to incorporate any indicator created according to the model to be developed in the SGI HERCULES specifications.

As a starting point, the contractor may refer to the indicators used at national level (for example, by the INE) and at international level (for example, u-ranking) that are considered relevant (across different areas, for example teaching, research, citations, technology transfer, etc.). Other indicators that can serve as a reference are those mentioned in the Plan Estatal de Investigación Científica y Técnica y de Innovación 2017-2020:

  • Number of scientific publications in international collaboration.
  • Percentage of scientific publications in international collaboration.
  • Percentage of total R&D investment executed by public administration research bodies and by higher education institutions.
  • etc.
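
The first two indicators in this list can be computed with a few lines of code. The sketch below assumes that each publication is represented by the set of countries of its authors' affiliations and that a publication counts as an international collaboration when that set contains more than one country; both the data model and the function name are assumptions for illustration.

```python
def international_collaboration(publications: list[set[str]]) -> tuple[int, float]:
    """Return (count, percentage) of publications with authors from 2+ countries.

    Each publication is given as the set of its authors' affiliation countries.
    """
    total = len(publications)
    count = sum(1 for countries in publications if len(countries) > 1)
    percentage = 100.0 * count / total if total else 0.0
    return count, percentage


# Hypothetical publications, for illustration only.
pubs = [{"ES"}, {"ES", "FR"}, {"ES", "DE", "IT"}, {"ES"}]
count, percentage = international_collaboration(pubs)
print(count, percentage)
```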

Other examples of indicators are those described in the European Innovation Scoreboard report, for example:

  • R&D expenditure in the public sector (as a percentage of Gross Domestic Product).
  • Scientific publications among the top 10% most cited worldwide, as a percentage of the country's total scientific publications.
  • etc.

For indicators that require external data, the contractor must state which data are needed and how they will be obtained for the calculation. Some external data sources are the INE, Eurostat and Web of Science, which can be used to obtain, for example, Spain's Gross Domestic Product, the number of inhabitants, etc.

Indicators of public-private collaboration and of private funding of research and transfer activity are also of interest.

This module will be able to cover at least the following needs:

  • As a user, I need to obtain the list of indicators with their value and unit of measurement (percentage, number, etc.) calculated over a period of time, either for the whole university or for each research center/structure of each university.
  • As a user, I need a visualization of the evolution of indicators along a timeline (years, quarters, etc.).
  • As a user, I need access to a prediction of the evolution of indicators based on the time series of their values and on the variables in the system that are considered related, which will be configurable.
  • As a user, I need to detect trends in research areas and lines from the data available in Hércules.
  • As a user, I need to quantify the contribution of each researcher, research line, knowledge area, research center, autonomous community, etc. to each indicator.
  • As a user, I am interested in clustering research lines and areas, researchers, groups, research centers, autonomous communities, etc., using as the classification criterion one or more productivity indicators chosen by the user.
  • As a researcher and manager, I am interested in knowing which research lines and areas, researchers, groups, research centers, autonomous communities, etc. show significant deviations from the mean of the scientific productivity indicators.
  • As a decision-maker, I am interested in knowing the profile and evolution of the research and transfer relationship between a company and a set of research centers over a period of time.
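
One possible reading of "significant deviations from the mean" is a z-score threshold. The sketch below flags the entities whose indicator value lies at least two standard deviations from the mean; the threshold, the sample values and the function name are assumptions, and the specification leaves the actual statistical criterion open.

```python
from statistics import mean, pstdev


def significant_deviations(values: dict[str, float],
                           threshold: float = 2.0) -> dict[str, float]:
    """Return the z-score of every entity with |z| >= threshold."""
    data = list(values.values())
    mu = mean(data)
    sigma = pstdev(data)  # population standard deviation
    if sigma == 0:
        return {}  # all values identical: nothing deviates
    z_scores = {name: (v - mu) / sigma for name, v in values.items()}
    return {name: z for name, z in z_scores.items() if abs(z) >= threshold}


# Hypothetical publications-per-year figures for six research groups.
productivity = {
    "Group A": 10, "Group B": 11, "Group C": 9,
    "Group D": 10, "Group E": 10, "Group F": 40,
}
flagged = significant_deviations(productivity)
print(flagged)
```

With very few entities a single outlier inflates the standard deviation, so a robust measure (for example, one based on the median) may be preferable in practice.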

The pre-commercial procurement will be carried out in successive elimination phases, so that the effectiveness and efficiency of the solutions proposed by the competing contractors are progressively verified, in order to create innovative solutions that better meet the Functional Requirements set out in the Specifications.

PHASE I, STUDY OF THE VIABILITY OF THE PROPOSED SOLUTION

STATUS: Ended

During this phase, aimed at ensuring that the proposed scientific-technological solutions and the R&D plan were as well suited as possible to the challenge posed, the contractors carried out a feasibility study in which they observed the operation and needs of the service receiving the solution and collected all the data needed to demonstrate the technical and economic viability of the proposed project in relation to the objective and the need raised.

At the end of this phase, they presented a final, adapted version of the supporting documentation for the proposed solution, which was submitted to an evaluation process.

This evaluation used the evaluation and selection criteria established for Phase I. For access to Phase II, a single contractor could be selected for each lot from among those that obtained the best score, provided that their solutions reached the minimum score required.

PHASE II, DEVELOPMENT OF THE SOLUTION

STATUS: Running

In this phase, the selected contractors develop the solution to a point that allows evaluating its capacity to meet, and even exceed, the Functional Requirements indicated in the specifications, making, where appropriate, all the adjustments, tests and additional studies needed to complete it.

LOT 1 BUDGET. DATA ENRICHMENT

Concept    Budget without VAT    VAT            Total
Phase 1    44.628,10 €           9.371,90 €     54.000,00 €
Phase 2    393.388,43 €          82.611,57 €    476.000,00 €
Total      438.016,53 €          91.983,47 €    530.000,00 €

LOT 2 BUDGET. ANALYSIS METHODS

Concept    Budget without VAT    VAT            Total
Phase 1    29.752,07 €           6.247,93 €     36.000,00 €
Phase 2    284.297,52 €          59.702,48 €    344.000,00 €
Total      314.049,59 €          65.950,41 €    380.000,00 €
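
The amounts in both budget tables are consistent with a VAT rate of 21%. The page does not state the rate; it is inferred here from the figures themselves. The following sketch recomputes the VAT and total for each phase from the net amounts, using decimal arithmetic to avoid floating-point rounding issues.

```python
from decimal import Decimal, ROUND_HALF_UP

# Assumption: 21% VAT, inferred from the published figures, not stated in the source.
VAT_RATE = Decimal("0.21")


def with_vat(net: str) -> tuple[Decimal, Decimal]:
    """Return (VAT, total) for a net amount, rounded to cents."""
    net_amount = Decimal(net)
    vat = (net_amount * VAT_RATE).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
    return vat, net_amount + vat


# Net amounts (budget without VAT) taken from the two tables.
budget = {
    "Lot 1, Phase 1": "44628.10",
    "Lot 1, Phase 2": "393388.43",
    "Lot 2, Phase 1": "29752.07",
    "Lot 2, Phase 2": "284297.52",
}
for concept, net in budget.items():
    vat, total = with_vat(net)
    print(f"{concept}: VAT {vat} EUR, total {total} EUR")
```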

DE (DATA ENRICHMENT)

DEAM LOT 1: DATA ENRICHMENT

Milestone 1. March 2021

  • Metadata models for the description and annotation of FAIR RO (Research Object) 70%

Milestone 2. June 2021

  • Metadata models for the description and annotation of FAIR RO 85%
  • Topic Extraction Library 40%
  • FAIR Research Objects Enrichment Library 30%

Milestone 3. October 2021

  • Metadata models for the description and annotation of FAIR RO 100%
  • Topic Extraction Library 100%
  • FAIR Research Objects Enrichment Library 100%
  • Semantic Similarity Library 60%

Milestone 4. January 2022

  • Semantic Similarity Library 100%
  • Results of the execution of the test plan 100%

Milestone 5. March 2022 (End of project)

  • Implemented platform
  • Docker images of each module
  • Manuals
  • Test plan execution results
  • Source code
  • Training materials

MA (ANALYSIS METHODS)

DEAM LOT 2: ANALYSIS METHODS

Milestone 1. March 2021

  • Analysis and design 80%. Completed design of the user interface and navigation model.
  • Digital model 70%. Design of the indicator model (1st version).
  • Results of the execution of the digital model test plan. Tests executed on the version of the indicator model.

Milestone 2. June 2021

  • Analysis and design 90%.
  • Digital model 85%. Design of the indicator model (2nd version).
  • Research Synergy Finder module. First development cycle completed (50%).
  • First version of the library of analysis, visualization and prediction methods
  • First version of the synergy detection library
  • First version of the projects module. First development cycle completed (50%)

Milestone 3. October 2021

  • Analysis and design
  • Digital model. Indicator model design completed
  • Research Synergy Finder + Projects Module
  • Second version of the library of analysis, visualization and prediction methods
  • Second version of the synergy detection library
  • Second version of the projects module. Second development cycle completed (90%)
  • First version of the development of the indicator model (50%)

Milestone 4. January 2022

  • Projects module completed - connection with the Research Management System (SGI)
  • Development of the indicator model completed

Milestone 5. March 2022 (End of project)

  • Platform implemented. Development of all modules completed.
  • Cloud deployment architecture (Docker)
  • Completed system libraries (analysis, visualization, prediction and synergy methods)
  • Documentation and manuals
  • Test plan execution results
  • Source code repository delivery
  • Training material

Regarding Lot 1, Phase 2 of the project, in which the proposed solution is developed, was awarded to the U.T.E. RIAM INTELEARNING LAB, S.L. - UNIVERSIDAD DE DEUSTO. The kick-off meeting took place on January 4, 2021.

Regarding Lot 2, Phase 2 of the project, in which the proposed solution is developed, was awarded to the company RIAM INTELEARNING LAB, S.L. The kick-off meeting for Phase 2 took place on November 24, 2020.

The documentation (in Spanish) is available in Confluence through the following links:

Enriquecimiento de Datos - Hércules - Confluence (um.es)

Métodos de Análisis - Hércules - Confluence (um.es)

Training sessions for the Data Enrichment and Analysis Methods tool

ED/MA technical training for developers

Session 1
  • Knowledge unit: Logical and physical architecture; Initial load; Practical session
  • Video links: Introduction to ED and MA; Hercules ED; Hercules MA; Hércules ED architecture; Hercules MA architecture; Loading CVs in CVN format; Loading; Explanation of the FECYT services; Harvester; General loading process; Prerequisites; Import service; OAI-PMH service
  • Support document: Training support document (6-9-22)
Session 2
  • Knowledge unit: Configuration of pages, searches and record views
  • Video links: CMS page configuration; Page configuration - Search pages; View editing - Administration of views and translations; View editor - View compiler; Metasearch engine
  • Support document: Training support document (7-9-22)
Session 3
  • Knowledge unit: Configuration of indicator charts
  • Video link: Session video (9-9-22)
  • Support document: Training support document (9-9-22)
Session 4
  • Knowledge unit: Data enrichment with thematic descriptors and specific descriptors (keywords)
  • Video link: Session video (12-9-22)
  • Support document: Training support document (12-9-22)
Session 5
  • Knowledge unit: Information enrichment; Information enrichment by similarity
  • Video link: Session video (16-9-22)
  • Support document: Training support document (14-9-22)
Session 6
  • Knowledge unit: Configuration of item duplication
  • Video link: Session video (22-9-22)
  • Support document: Training support document (22-9-22)
Session 7
  • Knowledge unit: Configuration of external sources
  • Video link: Session video 27-9-22
  • Support document: Session support document 27-9-22
Session 8
  • Knowledge unit: Configuration of CV items
  • Video link: Session video 28-9-22
  • Support document: Session support document 28-9-22

ED/MA functional training

Session 1
  • Knowledge unit: Presentation of the research portal tool
  • Video link: Session video 19-9-22
  • Support document: Session support document 19-9-22
Session 2
  • Knowledge unit: Cluster creation, Technology offer, Indicators and configuration
  • Video link: Session video 20-9-22
  • Support document: Support presentation 20-9-22
Session 3
  • Knowledge unit: Platform administrators, general indicator panel and configuration of the general indicator panel
  • Video link: Session video 26-9-22
  • Support document: Support document, Platform administrators session
Session 4
  • Knowledge unit: Transfer managers, Management of the technology offer
  • Video link: Transfer session video 27-9-22
  • Support document: Transfer session support document 27-9-22
Session 5
  • Knowledge unit: Researcher training, CV editing, CV import, Indicator panel management
  • Video link: Session video 27-9-22
  • Support document: Session support document 27-9-22
Session 6
  • Knowledge unit: Researcher training. Data enrichment, validations and CV export
  • Video link: Session video 28-9-22
  • Support document: Session support document 28-9-22

ED/MA technical training for systems staff

Session 1
  • Knowledge unit: Logical and physical architecture; Deployment
  • Video link: Link to session video
  • Support document: Support document, Session 1 (18/10/2022)
Session 2
  • Knowledge unit: Practical session
  • Video link: Link to session video
  • Support document: Support document, Session 2 (19/10/2022)
Session 3
  • Knowledge unit: Initial load; Configuration of external sources
  • Video link: Link to session video
  • Support document: Support document, Session 3 (21/10/2022)
Session 4
  • Knowledge unit: Monitoring and maintenance of ED
  • Video link: Link to session video
  • Support document: Support document, Session 4 (25/10/2022)

Hércules Project Office

3rd floor, Edificio de Servicios Integrados. Facultad de Medicina
Campus de Espinardo
Universidad de Murcia, Murcia
Tel: +34 868 88 7184/3550
Email: hercules@um.es