May 30, 2017

Published papers

GEOLSemantics develops its multilingual semantic extraction technology by taking part in collaborative research and development projects at the national and European levels.


SAIMSI project :


The SAIMSI project aims to build a prototype system that accumulates structured information about the actions of people suspected of illicit activities.

This information is automatically extracted from internet sources in several languages (French, English, Arabic and Mandarin Chinese), in different media (text and speech) and from different types of sources (web pages, press releases, social networks…). Within the framework of the project, we limited ourselves to open sources.


The information extracted from the various languages is represented using Semantic Web standards (RDF), independently of the source language, and complies with a security ontology developed within the framework of the project. English was chosen to name concepts and relations.


The collected information is managed in two databases: a knowledge base containing the structured information extracted from the different documents, and a textual database that contains the source documents and is searchable in interlingua. While viewing a text in the textual database, you can request the structured information held in the knowledge base about a mentioned entity (person, location, company…). Conversely, for every piece of information in the knowledge base, you can retrieve all the original documents from the textual database.
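The two-way link between the knowledge base and the textual database can be illustrated with a minimal sketch. It is not the SAIMSI implementation: the `Fact` record, the `TEXT_DB` and `KNOWLEDGE_BASE` containers, the two lookup functions and all the sample data are hypothetical, and plain Python structures stand in for a real RDF store. The sketch only assumes that each extracted triple keeps the identifier of the document it came from.

```python
from collections import namedtuple

# Hypothetical data model: an RDF-style triple plus the id of the source
# document it was extracted from (provenance).
Fact = namedtuple("Fact", ["subject", "predicate", "obj", "doc_id"])

# Textual database: source documents kept in their original language (toy data).
TEXT_DB = {
    "doc-fr-1": "Person A a rencontré Person B à City X.",
    "doc-en-1": "Person A met Person B.",
}

# Knowledge base: concepts and relations are named in English,
# whatever the language of the source document.
KNOWLEDGE_BASE = [
    Fact("Person_A", "met", "Person_B", "doc-fr-1"),
    Fact("Person_A", "met", "Person_B", "doc-en-1"),
    Fact("Person_A", "locatedIn", "City_X", "doc-fr-1"),
]


def facts_about(entity):
    """Text -> knowledge: distinct triples in which the entity appears."""
    seen = []
    for fact in KNOWLEDGE_BASE:
        triple = (fact.subject, fact.predicate, fact.obj)
        if entity in (fact.subject, fact.obj) and triple not in seen:
            seen.append(triple)
    return seen


def documents_for(subject, predicate, obj):
    """Knowledge -> text: ids of all source documents supporting a triple."""
    return sorted(
        fact.doc_id
        for fact in KNOWLEDGE_BASE
        if (fact.subject, fact.predicate, fact.obj) == (subject, predicate, obj)
    )
```

In this sketch, `facts_about("Person_B")` plays the role of asking for structured information on an entity seen in a text, and `documents_for("Person_A", "met", "Person_B")` returns the ids to look up in `TEXT_DB`, here one French and one English document supporting the same language-independent fact.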



ORELO project :


ORELO aims to develop techniques for identifying the Arabic dialect of origin of a text or quotation written in Arabic or Latin characters. The dialects considered by the project are the main dialects of the Maghreb (Moroccan, Algerian, Tunisian) and Egyptian.

The Maghreb dialects have received little attention in natural language processing. Including Egyptian will allow comparisons with previous work on Egyptian and on the languages of the Mashriq. This preliminary work is essential for Vocapia to extend its automatic transcription systems for standard Arabic speech to the various dialects. It is also a prerequisite for GEOLSemantics to make its knowledge extraction from standard Arabic robust to the presence of dialectal words. The approach proposed by GEOLSemantics for identifying written dialects, based on dialect dictionaries, already provides the resources needed for the next steps.
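As an illustration of what a dictionary-based approach to written dialect identification can look like, here is a minimal sketch. It is not the ORELO method: the `DIALECT_LEXICONS` table, the `identify_dialect` function and the simple hit-counting score are assumptions, and the handful of Latin-script entries are toy examples rather than the project's dictionaries. Real lexicons would be far larger and would have to deal with spelling variants and with words shared between dialects.

```python
import re

# Hypothetical toy lexicons (Latin-script spellings); illustrative only.
DIALECT_LEXICONS = {
    "Moroccan": {"daba", "bzaf", "wakha"},
    "Algerian": {"dork", "wesh"},
    "Tunisian": {"barcha", "tawa"},
    "Egyptian": {"delwa2ti", "awi", "ezzay"},
}


def identify_dialect(text):
    """Return the dialect whose lexicon matches the most tokens, or None.

    Ties are resolved in favour of the dialect listed first in
    DIALECT_LEXICONS; a real system would need a better tie-breaking rule.
    """
    # \w+ keeps Arabic letters, Latin letters and digits (e.g. "delwa2ti").
    tokens = re.findall(r"\w+", text.lower())
    best_dialect, best_hits = None, 0
    for dialect, lexicon in DIALECT_LEXICONS.items():
        hits = sum(1 for token in tokens if token in lexicon)
        if hits > best_hits:
            best_dialect, best_hits = dialect, hits
    return best_dialect
```

A call such as `identify_dialect(some_text)` returns one of the four dialect names, or `None` when no dictionary word is found, which would be the expected outcome for a text written entirely in standard Arabic or in another language.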



DRIRS project :


The DRIRS project aims to identify activities that promote radical ideas on social networks, detect influences and map circles of likely recruits. It targets the upstream stage of radicalization, which uses unencrypted networks to reach the widest possible audience.




Other published papers :