- Getting practical with GeoSPARQL and Apache Jena
  In: Timo Homburg, Beyza Yaman, Mohamed Ahmed Sherif and Axel-Cyrille Ngonga Ngomo (eds.): Proceedings of the 6th International Workshop on Geospatial Linked Data 2024 co-located with 21st Extended Semantic Web Conference (ESWC 2024), CEUR Workshop Proceedings. vol. 3743. Hersonissos, Greece
  Simon Bin, Claus Stadler, Lorenz Bühmann and Michael Martin
  This paper explores the integration of geo-spatial data into RDF (Resource Description Framework) using Apache Jena, a popular Java-based framework for building Semantic Web applications. We explain the basic representation of geo-spatial data in RDF with a focus on both the new GeoSPARQL 1.1 standard and Apache Jena. Our investigation covers advanced techniques, such as transformation of coordinate reference systems, aggregation of geo-spatial data, creation of new geo-objects, and simplification of polygons. Additionally, we discuss the usage of the H3 Grid as a Discrete Global Grid System (DGGS) for geo-spatial conversion. Furthermore, we present performance optimisations specific to Apache Jena, including per-graph geo-indexing, improved geo-index serialization for faster startup times, and manual optimisation of geo-spatial queries. We conclude with a comparison of different geo-functions and outline future directions for enhancing geo-spatial data management in RDF.
@inproceedings{bin2024geosparql,
abstract = {This paper explores the integration of geo-spatial data into RDF (Resource Description Framework) using Apache Jena, a popular Java-based framework for building Semantic Web applications. We explain the basic representation of geo-spatial data in RDF with a focus on both the new GeoSPARQL 1.1 standard and Apache Jena. Our investigation covers advanced techniques, such as transformation of coordinate reference systems, aggregation of geo-spatial data, creation of new geo-objects, and simplification of polygons. Additionally, we discuss the usage of the H3 Grid as a Discrete Global Grid System (DGGS) for geo-spatial conversion. Furthermore, we present performance optimisations specific to Apache Jena, including per-graph geo-indexing, improved geo-index serialization for faster startup times, and manual optimisation of geo-spatial queries. We conclude with a comparison of different geo-functions and outline future directions for enhancing geo-spatial data management in RDF.},
address = {Hersonissos, Greece},
author = {Bin, Simon and Stadler, Claus and B{ü}hmann, Lorenz and Martin, Michael},
booktitle = {Proceedings of the 6th International Workshop on Geospatial Linked Data 2024 co-located with 21st Extended Semantic Web Conference (ESWC 2024)},
editor = {Homburg, Timo and Yaman, Beyza and Sherif, Mohamed Ahmed and Ngonga Ngomo, Axel-Cyrille},
keywords = {sys:relevantFor:infai},
month = {05},
series = {{CEUR} Workshop Proceedings},
title = {Getting practical with {G}eo{SPARQL} and {A}pache {J}ena},
volume = 3743,
year = 2024
}
%0 Conference Paper
%1 bin2024geosparql
%A Bin, Simon
%A Stadler, Claus
%A Bühmann, Lorenz
%A Martin, Michael
%B Proceedings of the 6th International Workshop on Geospatial Linked Data 2024 co-located with 21st Extended Semantic Web Conference (ESWC 2024)
%C Hersonissos, Greece
%D 2024
%E Homburg, Timo
%E Yaman, Beyza
%E Sherif, Mohamed Ahmed
%E Ngonga Ngomo, Axel-Cyrille
%T Getting practical with GeoSPARQL and Apache Jena
%U https://ceur-ws.org/Vol-3743/paper2.pdf
%V 3743
%X This paper explores the integration of geo-spatial data into RDF (Resource Description Framework) using Apache Jena, a popular Java-based framework for building Semantic Web applications. We explain the basic representation of geo-spatial data in RDF with a focus on both the new GeoSPARQL 1.1 standard and Apache Jena. Our investigation covers advanced techniques, such as transformation of coordinate reference systems, aggregation of geo-spatial data, creation of new geo-objects, and simplification of polygons. Additionally, we discuss the usage of the H3 Grid as a Discrete Global Grid System (DGGS) for geo-spatial conversion. Furthermore, we present performance optimisations specific to Apache Jena, including per-graph geo-indexing, improved geo-index serialization for faster startup times, and manual optimisation of geo-spatial queries. We conclude with a comparison of different geo-functions and outline future directions for enhancing geo-spatial data management in RDF.

- KGCW2024 Challenge Report: RDFProcessingToolkit
  In: David Chaves-Fraga, Anastasia Dimou, Ana Iglesias-Molina, Umutcan Serles and Dylan Van Assche (eds.): Proceedings of the 5th International Workshop on Knowledge Graph Construction co-located with 21th Extended Semantic Web Conference (ESWC 2024), CEUR Workshop Proceedings. vol. 3718. Hersonissos, Greece
  Claus Stadler and Simon Bin
  This is the report of the participation of the RDFProcessingToolkit (RPT) in the KGCW2024 Challenge at ESWC 2024. The RPT system processes RML specifications by translating them into a series of extended SPARQL CONSTRUCT queries. The necessary SPARQL extensions are provided as plugins for the Apache Jena framework. This year’s challenge comprises a performance and a conformance track. For the performance track, a homogeneous environment was kindly provided by the workshop organizers in order to facilitate comparability of measurements. In this track, we mainly adapted the setup from our last year’s participation. For the conformance track, we updated our system with support for the rml-core module of the upcoming RML revision. We also report on the issues and shortcomings we encountered as a base for future improvements.
@inproceedings{DBLP:conf/kgcw/StadlerB24,
abstract = {This is the report of the participation of the RDFProcessingToolkit (RPT) in the KGCW2024 Challenge at ESWC 2024. The RPT system processes RML specifications by translating them into a series of extended SPARQL CONSTRUCT queries. The necessary SPARQL extensions are provided as plugins for the Apache Jena framework. This year’s challenge comprises a performance and a conformance track. For the performance track, a homogeneous environment was kindly provided by the workshop organizers in order to facilitate comparability of measurements. In this track, we mainly adapted the setup from our last year’s participation. For the conformance track, we updated our system with support for the rml-core module of the upcoming RML revision. We also report on the issues and shortcomings we encountered as a base for future improvements.},
address = {Hersonissos, Greece},
author = {Stadler, Claus and Bin, Simon},
booktitle = {Proceedings of the 5th International Workshop on Knowledge Graph Construction co-located with 21th Extended Semantic Web Conference {(ESWC} 2024)},
editor = {Chaves{-}Fraga, David and Dimou, Anastasia and Iglesias{-}Molina, Ana and Serles, Umutcan and Assche, Dylan Van},
keywords = {sys:relevantFor:infai},
month = {05},
series = {{CEUR} Workshop Proceedings},
title = {{KGCW2024} Challenge Report: {RDF}{P}rocessing{T}oolkit},
volume = 3718,
year = 2024
}
%0 Conference Paper
%1 DBLP:conf/kgcw/StadlerB24
%A Stadler, Claus
%A Bin, Simon
%B Proceedings of the 5th International Workshop on Knowledge Graph Construction co-located with 21th Extended Semantic Web Conference (ESWC 2024)
%C Hersonissos, Greece
%D 2024
%E Chaves-Fraga, David
%E Dimou, Anastasia
%E Iglesias-Molina, Ana
%E Serles, Umutcan
%E Assche, Dylan Van
%T KGCW2024 Challenge Report: RDFProcessingToolkit
%U https://ceur-ws.org/Vol-3718/paper13.pdf
%V 3718
%X This is the report of the participation of the RDFProcessingToolkit (RPT) in the KGCW2024 Challenge at ESWC 2024. The RPT system processes RML specifications by translating them into a series of extended SPARQL CONSTRUCT queries. The necessary SPARQL extensions are provided as plugins for the Apache Jena framework. This year’s challenge comprises a performance and a conformance track. For the performance track, a homogeneous environment was kindly provided by the workshop organizers in order to facilitate comparability of measurements. In this track, we mainly adapted the setup from our last year’s participation. For the conformance track, we updated our system with support for the rml-core module of the upcoming RML revision. We also report on the issues and shortcomings we encountered as a base for future improvements.

- FAIR Data Publishing with Apache Maven
  In: Leyla Jael Castro, Dietrich Rebholz-Schuhmann, Danilo Dessì and Sonja Schimmler (eds.): Proceedings of the Fourth Workshop on Metadata and Research (objects) Management for Linked Open Science - DaMaLOS 2024 co-located with Extended Semantic Web Conference (ESWC). Hersonissos, Greece : PUBLISSO
  Claus Stadler, Lorenz Bühmann and Simon Bin
@inproceedings{Stadler2024fair,
address = {Hersonissos, Greece},
author = {Stadler, Claus and Bühmann, Lorenz and Bin, Simon},
booktitle = {Proceedings of the Fourth Workshop on Metadata and Research (objects) Management for Linked Open Science — DaMaLOS 2024 co-located with Extended Semantic Web Conference (ESWC)},
editor = {Castro, Leyla Jael and Rebholz-Schuhmann, Dietrich and Dessì, Danilo and Schimmler, Sonja},
keywords = {sys:relevantFor:infai},
month = {05},
publisher = {PUBLISSO},
title = {{FAIR} Data Publishing with Apache Maven},
year = 2024
}
%0 Conference Paper
%1 Stadler2024fair
%A Stadler, Claus
%A Bühmann, Lorenz
%A Bin, Simon
%B Proceedings of the Fourth Workshop on Metadata and Research (objects) Management for Linked Open Science — DaMaLOS 2024 co-located with Extended Semantic Web Conference (ESWC)
%C Hersonissos, Greece
%D 2024
%E Castro, Leyla Jael
%E Rebholz-Schuhmann, Dietrich
%E Dessì, Danilo
%E Schimmler, Sonja
%I PUBLISSO
%R 10.4126/FRL01-006474023
%T FAIR Data Publishing with Apache Maven
%U https://repository.publisso.de/resource/frl:6483281/data

- Base Platform for Knowledge Graphs with Free Software
  In: Sebastian Tramp, Ricardo Usbeck, Natanael Arndt, Julia Holze and Sören Auer (eds.): Proceedings of the International Workshop on Linked Data-driven Resilience Research 2023, CEUR Workshop Proceedings. vol. 3401. Hersonissos, Greece
  Simon Bin, Claus Stadler, Norman Radtke, Kurt Junghanns, Sabine Gründer-Fahrer and Michael Martin
  We present an Open Source base platform for the CoyPu knowledge graph project in the resilience domain. We report on our experiences with several tools which are used to create, maintain, serve, view and explore a modular large-scale knowledge graph, as well as the adaptions that were necessary to enable frictionless interaction from both performance and usability perspectives. For this purpose, several adjustments had to be made. We provide a broad view of different programs which are of relevance to this domain. We demonstrate that while it is already possible to achieve good results with free software, there are still several pain points that need to be addressed. Resolution of these issues is often not only a matter of configuration but requires modification of the source code as well.
@inproceedings{bin-2023–base-platform,
abstract = {We present an Open Source base platform for the CoyPu knowledge graph project in the resilience domain. We report on our experiences with several tools which are used to create, maintain, serve, view and explore a modular large-scale knowledge graph, as well as the adaptions that were necessary to enable frictionless interaction from both performance and usability perspectives. For this purpose, several adjustments had to be made. We provide a broad view of different programs which are of relevance to this domain. We demonstrate that while it is already possible to achieve good results with free software, there are still several pain points that need to be addressed. Resolution of these issues is often not only a matter of configuration but requires modification of the source code as well.},
address = {Hersonissos, Greece},
author = {Bin, Simon and Stadler, Claus and Radtke, Norman and Junghanns, Kurt and Gründer-Fahrer, Sabine and Martin, Michael},
booktitle = {Proceedings of the International Workshop on Linked Data-driven Resilience Research 2023},
editor = {Tramp, Sebastian and Usbeck, Ricardo and Arndt, Natanael and Holze, Julia and Auer, Sören},
keywords = {sys:relevantFor:infai},
month = {05},
series = {{CEUR} Workshop Proceedings},
title = {Base Platform for Knowledge Graphs with Free Software},
volume = 3401,
year = 2023
}
%0 Conference Paper
%1 bin-2023–base-platform
%A Bin, Simon
%A Stadler, Claus
%A Radtke, Norman
%A Junghanns, Kurt
%A Gründer-Fahrer, Sabine
%A Martin, Michael
%B Proceedings of the International Workshop on Linked Data-driven Resilience Research 2023
%C Hersonissos, Greece
%D 2023
%E Tramp, Sebastian
%E Usbeck, Ricardo
%E Arndt, Natanael
%E Holze, Julia
%E Auer, Sören
%T Base Platform for Knowledge Graphs with Free Software
%U https://ceur-ws.org/Vol-3401/paper6.pdf
%V 3401
%X We present an Open Source base platform for the CoyPu knowledge graph project in the resilience domain. We report on our experiences with several tools which are used to create, maintain, serve, view and explore a modular large-scale knowledge graph, as well as the adaptions that were necessary to enable frictionless interaction from both performance and usability perspectives. For this purpose, several adjustments had to be made. We provide a broad view of different programs which are of relevance to this domain. We demonstrate that while it is already possible to achieve good results with free software, there are still several pain points that need to be addressed. Resolution of these issues is often not only a matter of configuration but requires modification of the source code as well.

- KGCW2023 Challenge Report RDFProcessingToolkit / Sansa
  In: 4th International Workshop on Knowledge Graph Construction @ ESWC 2023, CEUR workshop proceedings. Hersonissos, Greece
  Simon Bin, Claus Stadler and Lorenz Bühmann
  This is the report of our participation in the KGCW2023 Challenge @ ESWC 2023 with our RDFProcessingToolkit/Sansa system which won the “fastest” tool award. The challenge was about the construction of RDF knowledge graphs from RML specifications with varying complexity in regard to the mix of input formats, characteristics of the data and the needed join operations. We detail how we integrated our tool into the provided benchmark framework. Thereby we also report on the issues and shortcomings we encountered as a base for future improvements. Furthermore, we provide an analysis of the data measured with the benchmark framework.
@inproceedings{stadler2023-kgcw-challenge,
abstract = {This is the report of our participation in the KGCW2023 Challenge @ ESWC 2023 with our RDFProcessingToolkit/Sansa system which won the “fastest” tool award. The challenge was about the construction of RDF knowledge graphs from RML specifications with varying complexity in regard to the mix of input formats, characteristics of the data and the needed join operations. We detail how we integrated our tool into the provided benchmark framework. Thereby we also report on the issues and shortcomings we encountered as a base for future improvements. Furthermore, we provide an analysis of the data measured with the benchmark framework.},
address = {Hersonissos, Greece},
author = {Bin, Simon and Stadler, Claus and Bühmann, Lorenz},
booktitle = {4th International Workshop on Knowledge Graph Construction @ ESWC 2023},
keywords = {sys:relevantFor:infai},
number = 3471,
series = {CEUR workshop proceedings},
title = {{KGCW}2023 Challenge Report {RDF}{P}rocessing{T}oolkit / Sansa},
year = 2023
}
%0 Conference Paper
%1 stadler2023-kgcw-challenge
%A Bin, Simon
%A Stadler, Claus
%A Bühmann, Lorenz
%B 4th International Workshop on Knowledge Graph Construction @ ESWC 2023
%C Hersonissos, Greece
%D 2023
%N 3471
%T KGCW2023 Challenge Report RDFProcessingToolkit / Sansa
%U https://ceur-ws.org/Vol-3471/paper12.pdf
%X This is the report of our participation in the KGCW2023 Challenge @ ESWC 2023 with our RDFProcessingToolkit/Sansa system which won the “fastest” tool award. The challenge was about the construction of RDF knowledge graphs from RML specifications with varying complexity in regard to the mix of input formats, characteristics of the data and the needed join operations. We detail how we integrated our tool into the provided benchmark framework. Thereby we also report on the issues and shortcomings we encountered as a base for future improvements. Furthermore, we provide an analysis of the data measured with the benchmark framework.

- Semantification of Geospatial Information for Enriched Knowledge Representation in Context of Crisis Informatics
  In: Natanael Arndt, Sabine Gründer-Fahrer, Julia Holze, Michael Martin and Sebastian Tramp (eds.): Proceedings of the International Workshop on Data-driven Resilience Research 2022, CEUR Workshop Proceedings. vol. 3376. Leipzig, Germany
  Claus Stadler, Simon Bin, Lorenz Bühmann, Norman Radtke, Kurt Junghanns, Sabine Gründer-Fahrer and Michael Martin
  In the context of crisis informatics, the integration and exploitation of high volumes of heterogeneous data from multiple sources is one of the big chances as well as challenges up to now. Semantic Web technologies have proven a valuable means to integrate and represent knowledge on the basis of domain concepts which improves interoperability and interpretability of information resources and allows deriving more knowledge via semantic relations and reasoning. In this paper, we investigate the potential of representing and processing geospatial information within the semantic paradigm. We show, on the technical level, how existing open source means can be used and supplemented as to efficiently handle geographic information and to convey exemplary results highly relevant in context of crisis management applications. When given semantic resources get enriched with geospatial information, new information can be retrieved combining the concepts of multi-polygons and geo-coordinates and using the power of GeoSPARQL queries. Custom SPARQL extension functions and data types for JSON, XML and CSV as well as for dialects such as GeoJSON and GML allow for succinct integration of heterogeneous data. We implemented these features for the Apache Jena Semantic Web framework by leveraging its plugin systems. Furthermore, significant improvements w.r.t. GeoSPARQL query performance have been contributed to the framework.
@inproceedings{stadler-c-2022–geospacial,
abstract = {In the context of crisis informatics, the integration and exploitation of high volumes of heterogeneous data from multiple sources is one of the big chances as well as challenges up to now. Semantic Web technologies have proven a valuable means to integrate and represent knowledge on the basis of domain concepts which improves interoperability and interpretability of information resources and allows deriving more knowledge via semantic relations and reasoning. In this paper, we investigate the potential of representing and processing geospatial information within the semantic paradigm. We show, on the technical level, how existing open source means can be used and supplemented as to efficiently handle geographic information and to convey exemplary results highly relevant in context of crisis management applications. When given semantic resources get enriched with geospatial information, new information can be retrieved combining the concepts of multi-polygons and geo-coordinates and using the power of GeoSPARQL queries. Custom SPARQL extension functions and data types for JSON, XML and CSV as well as for dialects such as GeoJSON and GML allow for succinct integration of heterogeneous data. We implemented these features for the Apache Jena Semantic Web framework by leveraging its plugin systems. Furthermore, significant improvements w.r.t. GeoSPARQL query performance have been contributed to the framework.},
address = {Leipzig, Germany},
author = {Stadler, Claus and Bin, Simon and Bühmann, Lorenz and Radtke, Norman and Junghanns, Kurt and Gründer-Fahrer, Sabine and Martin, Michael},
booktitle = {Proceedings of the International Workshop on Data-driven Resilience Research 2022},
editor = {Arndt, Natanael and Gründer-Fahrer, Sabine and Holze, Julia and Martin, Michael and Tramp, Sebastian},
keywords = {sys:relevantFor:infai},
month = {07},
series = {{CEUR} Workshop Proceedings},
title = {Semantification of Geospatial Information for Enriched Knowledge Representation in Context of Crisis Informatics},
volume = 3376,
year = 2022
}
%0 Conference Paper
%1 stadler-c-2022–geospacial
%A Stadler, Claus
%A Bin, Simon
%A Bühmann, Lorenz
%A Radtke, Norman
%A Junghanns, Kurt
%A Gründer-Fahrer, Sabine
%A Martin, Michael
%B Proceedings of the International Workshop on Data-driven Resilience Research 2022
%C Leipzig, Germany
%D 2022
%E Arndt, Natanael
%E Gründer-Fahrer, Sabine
%E Holze, Julia
%E Martin, Michael
%E Tramp, Sebastian
%T Semantification of Geospatial Information for Enriched Knowledge Representation in Context of Crisis Informatics
%U https://ceur-ws.org/Vol-3376/paper03.pdf
%V 3376
%X In the context of crisis informatics, the integration and exploitation of high volumes of heterogeneous data from multiple sources is one of the big chances as well as challenges up to now. Semantic Web technologies have proven a valuable means to integrate and represent knowledge on the basis of domain concepts which improves interoperability and interpretability of information resources and allows deriving more knowledge via semantic relations and reasoning. In this paper, we investigate the potential of representing and processing geospatial information within the semantic paradigm. We show, on the technical level, how existing open source means can be used and supplemented as to efficiently handle geographic information and to convey exemplary results highly relevant in context of crisis management applications. When given semantic resources get enriched with geospatial information, new information can be retrieved combining the concepts of multi-polygons and geo-coordinates and using the power of GeoSPARQL queries. Custom SPARQL extension functions and data types for JSON, XML and CSV as well as for dialects such as GeoJSON and GML allow for succinct integration of heterogeneous data. We implemented these features for the Apache Jena Semantic Web framework by leveraging its plugin systems. Furthermore, significant improvements w.r.t. GeoSPARQL query performance have been contributed to the framework.

- Spatial concept learning and inference on geospatial polygon data
  In: Knowl. Based Syst. vol. 241
  Patrick Westphal, Tobias Grubenmann, Diego Collarana, Simon Bin, Lorenz Bühmann and Jens Lehmann
@article{WestphalGCBB022,
author = {Westphal, Patrick and Grubenmann, Tobias and Collarana, Diego and Bin, Simon and Bühmann, Lorenz and Lehmann, Jens},
journal = {Knowl. Based Syst.},
keywords = {sys:relevantFor:infai},
title = {Spatial concept learning and inference on geospatial polygon data},
volume = 241,
year = 2022
}
%0 Journal Article
%1 WestphalGCBB022
%A Westphal, Patrick
%A Grubenmann, Tobias
%A Collarana, Diego
%A Bin, Simon
%A Bühmann, Lorenz
%A Lehmann, Jens
%D 2022
%J Knowl. Based Syst.
%R 10.1016/j.knosys.2022.108233
%T Spatial concept learning and inference on geospatial polygon data
%U https://svn.aksw.org/papers/2022/knosys_spatial_concept_learning/public.pdf
%V 241

- Schema-agnostic SPARQL-driven faceted search benchmark generation
  In: Journal of Web Semantics, p. 100614
  Claus Stadler, Simon Bin, Lisa Wenige, Lorenz Bühmann and Jens Lehmann
  In this work, we present a schema-agnostic faceted browsing benchmark generation framework for RDF data and SPARQL engines. Faceted search is a technique that allows narrowing down sets of information items by applying constraints over their properties, whereas facets correspond to properties of these items. While our work can be used to realise real-world faceted search user interfaces, our focus lies on the construction and benchmarking of faceted search queries over knowledge graphs. The RDF model exhibits several traits that seemingly make it a natural foundation for faceted search: all information items are represented as RDF resources, property values typically already correspond to meaningful semantic classifications, and with SPARQL there is a standard language for uniformly querying instance and schema information. However, although faceted search is ubiquitous today, it is typically not performed on the RDF model directly. Two major sources of concern are the complexity of query generation and the query performance. To overcome the former, our framework comes with an intermediate domain-specific language. Thereby our approach is SPARQL-driven which means that every faceted search information need is intensionally expressed as a single SPARQL query. In regard to the latter, we investigate the possibilities and limits of real-time SPARQL-driven faceted search on contemporary triple stores. We report on our findings by evaluating systems performance and correctness characteristics when executing a benchmark generated using our generation framework. All components, namely the benchmark generator, the benchmark runners and the underlying faceted search framework, are published freely available as open source.
@article{stadler2020facete,
abstract = {In this work, we present a schema-agnostic faceted browsing benchmark generation framework for RDF data and SPARQL engines. Faceted search is a technique that allows narrowing down sets of information items by applying constraints over their properties, whereas facets correspond to properties of these items. While our work can be used to realise real-world faceted search user interfaces, our focus lies on the construction and benchmarking of faceted search queries over knowledge graphs. The RDF model exhibits several traits that seemingly make it a natural foundation for faceted search: all information items are represented as RDF resources, property values typically already correspond to meaningful semantic classifications, and with SPARQL there is a standard language for uniformly querying instance and schema information. However, although faceted search is ubiquitous today, it is typically not performed on the RDF model directly. Two major sources of concern are the complexity of query generation and the query performance. To overcome the former, our framework comes with an intermediate domain-specific language. Thereby our approach is SPARQL-driven which means that every faceted search information need is intensionally expressed as a single SPARQL query. In regard to the latter, we investigate the possibilities and limits of real-time SPARQL-driven faceted search on contemporary triple stores. We report on our findings by evaluating systems performance and correctness characteristics when executing a benchmark generated using our generation framework. All components, namely the benchmark generator, the benchmark runners and the underlying faceted search framework, are published freely available as open source.},
author = {Stadler, Claus and Bin, Simon and Wenige, Lisa and Bühmann, Lorenz and Lehmann, Jens},
journal = {Journal of Web Semantics},
keywords = {sys:relevantFor:infai},
pages = 100614,
title = {Schema-agnostic SPARQL-driven faceted search benchmark generation},
year = 2020
}
%0 Journal Article
%1 stadler2020facete
%A Stadler, Claus
%A Bin, Simon
%A Wenige, Lisa
%A Bühmann, Lorenz
%A Lehmann, Jens
%D 2020
%J Journal of Web Semantics
%P 100614
%R 10.1016/j.websem.2020.100614
%T Schema-agnostic SPARQL-driven faceted search benchmark generation
%U https://svn.aksw.org/papers/2020/JWS_Faceted_Search_Benchmark/public.pdf
%X In this work, we present a schema-agnostic faceted browsing benchmark generation framework for RDF data and SPARQL engines. Faceted search is a technique that allows narrowing down sets of information items by applying constraints over their properties, whereas facets correspond to properties of these items. While our work can be used to realise real-world faceted search user interfaces, our focus lies on the construction and benchmarking of faceted search queries over knowledge graphs. The RDF model exhibits several traits that seemingly make it a natural foundation for faceted search: all information items are represented as RDF resources, property values typically already correspond to meaningful semantic classifications, and with SPARQL there is a standard language for uniformly querying instance and schema information. However, although faceted search is ubiquitous today, it is typically not performed on the RDF model directly. Two major sources of concern are the complexity of query generation and the query performance. To overcome the former, our framework comes with an intermediate domain-specific language. Thereby our approach is SPARQL-driven which means that every faceted search information need is intensionally expressed as a single SPARQL query. In regard to the latter, we investigate the possibilities and limits of real-time SPARQL-driven faceted search on contemporary triple stores. We report on our findings by evaluating systems performance and correctness characteristics when executing a benchmark generated using our generation framework. All components, namely the benchmark generator, the benchmark runners and the underlying faceted search framework, are published freely available as open source.

- Automatic Subject Indexing with Knowledge Graphs
  In: LASCAR Workshop at the Extended Semantic Web Conference (ESWC)
  Lisa Wenige, Claus Stadler, Simon Bin, Lorenz Bühmann, Kurt Junghanns and Michael Martin
@inproceedings{wenige2020kindex,
author = {Wenige, Lisa and Stadler, Claus and Bin, Simon and Bühmann, Lorenz and Junghanns, Kurt and Martin, Michael},
booktitle = {LASCAR Workshop at the Extended Semantic Web Conference (ESWC)},
keywords = {sys:relevantFor:infai},
title = {Automatic Subject Indexing with Knowledge Graphs},
year = 2020
}
%0 Conference Paper
%1 wenige2020kindex
%A Wenige, Lisa
%A Stadler, Claus
%A Bin, Simon
%A Bühmann, Lorenz
%A Junghanns, Kurt
%A Martin, Michael
%B LASCAR Workshop at the Extended Semantic Web Conference (ESWC)
%D 2020
%T Automatic Subject Indexing with Knowledge Graphs
%U https://svn.aksw.org/papers/2020/LASCAR_Kindex/public.pdf

- DL-Learner – Structured Machine Learning on Semantic Web Data
  In: The Web Conf (WWW) 2018 Journals Track
  Lorenz Buehmann, Jens Lehmann, Patrick Westphal and Simon Bin
@inproceedings{www2018dllearner,
author = {Buehmann, Lorenz and Lehmann, Jens and Westphal, Patrick and Bin, Simon},
booktitle = {The Web Conf (WWW) 2018 Journals Track},
keywords = {sys:relevantFor:infai},
title = {DL-Learner – Structured Machine Learning on Semantic Web Data},
year = 2018
}
%0 Conference Paper
%1 www2018dllearner
%A Buehmann, Lorenz
%A Lehmann, Jens
%A Westphal, Patrick
%A Bin, Simon
%B The Web Conf (WWW) 2018 Journals Track
%D 2018
%T DL-Learner – Structured Machine Learning on Semantic Web Data
%U http://jens-lehmann.org/files/2018/www_dllearner.pdf

- SML-Bench – A Benchmarking Framework for Structured Machine Learning
  In: Semantic Web Journal
  Patrick Westphal, Lorenz Bühmann, Simon Bin, Hajira Jabeen and Jens Lehmann
@article{smlbench,
author = {Westphal, Patrick and Bühmann, Lorenz and Bin, Simon and Jabeen, Hajira and Lehmann, Jens},
journal = {Semantic Web Journal},
keywords = {sys:relevantFor:infai},
title = {SML-Bench – A Benchmarking Framework for Structured Machine Learning},
year = 2018
}
%0 Journal Article
%1 smlbench
%A Westphal, Patrick
%A Bühmann, Lorenz
%A Bin, Simon
%A Jabeen, Hajira
%A Lehmann, Jens
%D 2018
%J Semantic Web Journal
%T SML-Bench – A Benchmarking Framework for Structured Machine Learning
%U http://jens-lehmann.org/files/2018/swj_sml_bench.pdf

- Implementing Scalable Structured Machine Learning for Big Data in the SAKE Project
  In: IEEE Big Data Conference 2017
  Simon Bin, Patrick Westphal, Jens Lehmann and Axel-Cyrille Ngonga Ngomo
@inproceedings{bin-2017-sake,
author = {Bin, Simon and Westphal, Patrick and Lehmann, Jens and Ngonga, Axel-Cyrille Ngomo},
booktitle = {IEEE Big Data Conference 2017},
keywords = {sys:relevantFor:infai},
title = {Implementing Scalable Structured Machine Learning for Big Data in the SAKE Project},
year = 2017
}
%0 Conference Paper
%1 bin-2017-sake
%A Bin, Simon
%A Westphal, Patrick
%A Lehmann, Jens
%A {Ngonga Ngomo}, Axel-Cyrille
%B IEEE Big Data Conference 2017
%D 2017
%T Implementing Scalable Structured Machine Learning for Big Data in the SAKE Project
%U http://jens-lehmann.org/files/2017/ieee_bigdata_sake.pdf
- The Tale of Sansa Spark. In: Proceedings of 16th International Semantic Web Conference, Poster & Demos. Ivan Ermilov, Jens Lehmann, Gezim Sejdiu, Lorenz Bühmann, Patrick Westphal, Claus Stadler, Simon Bin, Nilesh Chakraborty, Henning Petzka, et al.
@inproceedings{iermilov-2017-sansa-iswc-demo,
author = {Ermilov, Ivan and Lehmann, Jens and Sejdiu, Gezim and Bühmann, Lorenz and Westphal, Patrick and Stadler, Claus and Bin, Simon and Chakraborty, Nilesh and Petzka, Henning and Saleem, Muhammad and {Ngonga Ngomo}, Axel-Cyrille and Jabeen, Hajira},
booktitle = {Proceedings of 16th International Semantic Web Conference, Poster \& Demos},
keywords = {buehmann},
title = {The Tale of Sansa Spark},
year = 2017
}
%0 Conference Paper
%1 iermilov-2017-sansa-iswc-demo
%A Ermilov, Ivan
%A Lehmann, Jens
%A Sejdiu, Gezim
%A Bühmann, Lorenz
%A Westphal, Patrick
%A Stadler, Claus
%A Bin, Simon
%A Chakraborty, Nilesh
%A Petzka, Henning
%A Saleem, Muhammad
%A {Ngonga Ngomo}, Axel-Cyrille
%A Jabeen, Hajira
%B Proceedings of 16th International Semantic Web Conference, Poster \& Demos
%D 2017
%T The Tale of Sansa Spark
%U https://svn.aksw.org/papers/2017/ISWC_SANSA_Demo/public.pdf
- Distributed Semantic Analytics using the SANSA Stack. In: Proceedings of 16th International Semantic Web Conference - Resources Track (ISWC’2017) : Springer, pp. 147–155. Jens Lehmann, Gezim Sejdiu, Lorenz Bühmann, Patrick Westphal, Claus Stadler, Ivan Ermilov, Simon Bin, Nilesh Chakraborty, Muhammad Saleem, et al.
@inproceedings{lehmann-2017-sansa-iswc,
author = {Lehmann, Jens and Sejdiu, Gezim and Bühmann, Lorenz and Westphal, Patrick and Stadler, Claus and Ermilov, Ivan and Bin, Simon and Chakraborty, Nilesh and Saleem, Muhammad and {Ngonga Ngomo}, Axel-Cyrille and Jabeen, Hajira},
booktitle = {Proceedings of 16th International Semantic Web Conference — Resources Track (ISWC’2017)},
keywords = {buehmann},
pages = {147–155},
publisher = {Springer},
title = {Distributed Semantic Analytics using the SANSA Stack},
year = 2017
}
%0 Conference Paper
%1 lehmann-2017-sansa-iswc
%A Lehmann, Jens
%A Sejdiu, Gezim
%A Bühmann, Lorenz
%A Westphal, Patrick
%A Stadler, Claus
%A Ermilov, Ivan
%A Bin, Simon
%A Chakraborty, Nilesh
%A Saleem, Muhammad
%A {Ngonga Ngomo}, Axel-Cyrille
%A Jabeen, Hajira
%B Proceedings of 16th International Semantic Web Conference — Resources Track (ISWC’2017)
%D 2017
%I Springer
%P 147–155
%T Distributed Semantic Analytics using the SANSA Stack
%U http://svn.aksw.org/papers/2017/ISWC_SANSA_SoftwareFramework/public.pdf
- Towards SPARQL-Based Induction for Large-Scale RDF Data sets. In: Gal A. Kaminka, Maria Fox, Paolo Bouquet, Eyke Hüllermeier, Virginia Dignum, Frank Dignum and Frank van Harmelen (eds.): ECAI 2016 - Proceedings of the 22nd European Conference on Artificial Intelligence, Frontiers in Artificial Intelligence and Applications. vol. 285 : IOS Press, ISBN 978-1-61499-672-9, pp. 1551–1552. Simon Bin, Lorenz Bühmann, Jens Lehmann and Axel-Cyrille Ngonga Ngomo
@inproceedings{sparqllearner,
author = {Bin, Simon and B{ü}hmann, Lorenz and Lehmann, Jens and {Ngonga Ngomo}, Axel-Cyrille},
booktitle = {ECAI 2016 — Proceedings of the 22nd European Conference on Artificial Intelligence},
editor = {Kaminka, Gal A. and Fox, Maria and Bouquet, Paolo and H{ü}llermeier, Eyke and Dignum, Virginia and Dignum, Frank and van Harmelen, Frank},
keywords = {buehmann},
pages = {1551–1552},
publisher = {IOS Press},
series = {Frontiers in Artificial Intelligence and Applications},
title = {Towards {SPARQL}-Based Induction for Large-Scale {RDF} Data sets},
volume = 285,
year = 2016
}
%0 Conference Paper
%1 sparqllearner
%A Bin, Simon
%A B{ü}hmann, Lorenz
%A Lehmann, Jens
%A {Ngonga Ngomo}, Axel-Cyrille
%B ECAI 2016 — Proceedings of the 22nd European Conference on Artificial Intelligence
%D 2016
%E Kaminka, Gal A.
%E Fox, Maria
%E Bouquet, Paolo
%E H{ü}llermeier, Eyke
%E Dignum, Virginia
%E Dignum, Frank
%E van Harmelen, Frank
%I IOS Press
%P 1551–1552
%R 10.3233/978-1-61499-672-9-1551
%T Towards {SPARQL}-Based Induction for Large-Scale {RDF} Data sets
%U http://svn.aksw.org/papers/2016/ECAI_SPARQL_Learner/public.pdf
%V 285
%@ 978-1-61499-672-9
- Comparing the Optimization Behaviour of Heuristics with Topology Based Visualization. In: Adrian-Horia Dediu, Manuel Lozano and Carlos Martín-Vide (eds.): Theory and Practice of Natural Computing, Lecture Notes in Computer Science. vol. 8890 : Springer International Publishing, ISBN 978-3-319-13748-3, pp. 47–58. Simon Bin, Sebastian Volke, Gerik Scheuermann and Martin Middendorf
@incollection{bin2014comparing,
author = {Bin, Simon and Volke, Sebastian and Scheuermann, Gerik and Middendorf, Martin},
booktitle = {Theory and Practice of Natural Computing},
editor = {Dediu, Adrian-Horia and Lozano, Manuel and Mart{\'\i}n-Vide, Carlos},
keywords = {landscape},
pages = {47–58},
publisher = {Springer International Publishing},
series = {Lecture Notes in Computer Science},
title = {Comparing the Optimization Behaviour of Heuristics with Topology Based Visualization},
volume = 8890,
year = 2014
}
%0 Book Section
%1 bin2014comparing
%A Bin, Simon
%A Volke, Sebastian
%A Scheuermann, Gerik
%A Middendorf, Martin
%B Theory and Practice of Natural Computing
%D 2014
%E Dediu, Adrian-Horia
%E Lozano, Manuel
%E Mart{\'\i}n-Vide, Carlos
%I Springer International Publishing
%P 47–58
%R 10.1007/978-3-319-13749-0_5
%T Comparing the Optimization Behaviour of Heuristics with Topology Based Visualization
%U http://dx.doi.org/10.1007/978-3-319-13749-0_5
%V 8890
%@ 978-3-319-13748-3
- Visual Analysis of Discrete Particle Swarm Optimization Using Fitness Landscapes. In: Hendrik Richter and Andries Engelbrecht (eds.): Recent Advances in the Theory and Application of Fitness Landscapes, Emergence, Complexity and Computation. vol. 6 : Springer Berlin Heidelberg, ISBN 978-3-642-41887-7, pp. 487–507. Sebastian Volke, Simon Bin, Dirk Zeckzer, Martin Middendorf and Gerik Scheuermann
@incollection{volke2014visual,
author = {Volke, Sebastian and Bin, Simon and Zeckzer, Dirk and Middendorf, Martin and Scheuermann, Gerik},
booktitle = {Recent Advances in the Theory and Application of Fitness Landscapes},
editor = {Richter, Hendrik and Engelbrecht, Andries},
keywords = {landscape},
pages = {487–507},
publisher = {Springer Berlin Heidelberg},
series = {Emergence, Complexity and Computation},
title = {Visual Analysis of Discrete Particle Swarm Optimization Using Fitness Landscapes},
volume = 6,
year = 2014
}
%0 Book Section
%1 volke2014visual
%A Volke, Sebastian
%A Bin, Simon
%A Zeckzer, Dirk
%A Middendorf, Martin
%A Scheuermann, Gerik
%B Recent Advances in the Theory and Application of Fitness Landscapes
%D 2014
%E Richter, Hendrik
%E Engelbrecht, Andries
%I Springer Berlin Heidelberg
%P 487–507
%R 10.1007/978-3-642-41888-4_17
%T Visual Analysis of Discrete Particle Swarm Optimization Using Fitness Landscapes
%U http://dx.doi.org/10.1007/978-3-642-41888-4_17
%V 6
%@ 978-3-642-41887-7
Simon Bin
Research Associate
Institution
Institut für Angewandte Informatik (InfAI) e. V.
Research Focus
Semantic Web, Knowledge Graphs, Web Ontology Language
Projects
Publications
- Getting practical with GeoSPARQL and Apache Jena. In: Timo Homburg, Beyza Yaman, Mohamed Ahmed Sherif and Axel-Cyrille Ngonga Ngomo (eds.): Proceedings of the 6th International Workshop on Geospatial Linked Data 2024 co-located with 21st Extended Semantic Web Conference (ESWC 2024), CEUR Workshop Proceedings. vol. 3743. Hersonissos, Greece
- KGCW2024 Challenge Report: RDFProcessingToolkit. In: David Chaves-Fraga, Anastasia Dimou, Ana Iglesias-Molina, Umutcan Serles and Dylan Van Assche (eds.): Proceedings of the 5th International Workshop on Knowledge Graph Construction co-located with 21th Extended Semantic Web Conference (ESWC 2024), CEUR Workshop Proceedings. vol. 3718. Hersonissos, Greece
- FAIR Data Publishing with Apache Maven. In: Leyla Jael Castro, Dietrich Rebholz-Schuhmann, Danilo Dessì and Sonja Schimmler (eds.): Proceedings of the Fourth Workshop on Metadata and Research (objects) Management for Linked Open Science - DaMaLOS 2024 co-located with Extended Semantic Web Conference (ESWC). Hersonissos, Greece : PUBLISSO
- Base Platform for Knowledge Graphs with Free Software. In: Sebastian Tramp, Ricardo Usbeck, Natanael Arndt, Julia Holze and Sören Auer (eds.): Proceedings of the International Workshop on Linked Data-driven Resilience Research 2023, CEUR Workshop Proceedings. vol. 3401. Hersonissos, Greece
- KGCW2023 Challenge Report RDFProcessingToolkit / Sansa. In: 4th International Workshop on Knowledge Graph Construction @ ESWC 2023, CEUR workshop proceedings. Hersonissos, Greece
- Semantification of Geospatial Information for Enriched Knowledge Representation in Context of Crisis Informatics. In: Natanael Arndt, Sabine Gründer-Fahrer, Julia Holze, Michael Martin and Sebastian Tramp (eds.): Proceedings of the International Workshop on Data-driven Resilience Research 2022, CEUR Workshop Proceedings. vol. 3376. Leipzig, Germany
- Spatial concept learning and inference on geospatial polygon data. In: Knowl. Based Syst. vol. 241
- Schema-agnostic SPARQL-driven faceted search benchmark generation. In: Journal of Web Semantics, p. 100614
- Automatic Subject Indexing with Knowledge Graphs. In: LASCAR Workshop at the Extended Semantic Web Conference (ESWC)
- DL-Learner – Structured Machine Learning on Semantic Web Data. In: The Web Conf (WWW) 2018 Journals Track
- SML-Bench – A Benchmarking Framework for Structured Machine Learning. In: Semantic Web Journal
- Implementing Scalable Structured Machine Learning for Big Data in the SAKE Project. In: IEEE Big Data Conference 2017
- The Tale of Sansa Spark. In: Proceedings of 16th International Semantic Web Conference, Poster & Demos
- Distributed Semantic Analytics using the SANSA Stack. In: Proceedings of 16th International Semantic Web Conference - Resources Track (ISWC’2017) : Springer, pp. 147–155
- Towards SPARQL-Based Induction for Large-Scale RDF Data sets. In: Gal A. Kaminka, Maria Fox, Paolo Bouquet, Eyke Hüllermeier, Virginia Dignum, Frank Dignum and Frank van Harmelen (eds.): ECAI 2016 - Proceedings of the 22nd European Conference on Artificial Intelligence, Frontiers in Artificial Intelligence and Applications. vol. 285 : IOS Press, ISBN 978-1-61499-672-9, pp. 1551–1552
- Comparing the Optimization Behaviour of Heuristics with Topology Based Visualization. In: Adrian-Horia Dediu, Manuel Lozano and Carlos Martín-Vide (eds.): Theory and Practice of Natural Computing, Lecture Notes in Computer Science. vol. 8890 : Springer International Publishing, ISBN 978-3-319-13748-3, pp. 47–58
- Visual Analysis of Discrete Particle Swarm Optimization Using Fitness Landscapes. In: Hendrik Richter and Andries Engelbrecht (eds.): Recent Advances in the Theory and Application of Fitness Landscapes, Emergence, Complexity and Computation. vol. 6 : Springer Berlin Heidelberg, ISBN 978-3-642-41887-7, pp. 487–507