Developing Advanced Data Mining and Subgroup Analysis Techniques for Bibliometric Research: The Biblium Python Package and Orange Add-on Orangebib (J5-50183)

Title: Developing Advanced Data Mining and Subgroup Analysis Techniques for Bibliometric Research: The Biblium Python Package and Orange Add-on Orangebib (J5-50183)

Head of the research group: Assoc. Prof. Dr. Lan Umek

Bibliometric analysis has become increasingly important in recent years as a means of evaluating and analyzing the scientific literature. As the proportion of bibliometric documents in total scientific output increases dramatically, more advanced statistical methods are needed to improve bibliometric analysis, especially methods from data mining and subgroup analysis. Although subgroups occur naturally in bibliographic data (temporal dimension, geographic scope, topic, etc.), they have rarely been evaluated or analyzed. In this project, we will present concrete examples of data mining methods that could be integrated into bibliometrics, especially for prediction (classification and regression) and subgroup analysis.

To address this gap, we will be the first to implement two subgroup discovery approaches in bibliometrics. Both algorithms aim to discover subgroups of bibliographic documents that reflect significant relationships between two aspects, such as keywords and authors. The first algorithm will combine a partitioning clustering approach with contingency table analysis to extract subgroups of documents that reflect significant relationships between the analyzed aspects. The second algorithm will combine a hierarchical clustering approach with statistical classification techniques (such as logistic regression, support vector machines, and neural networks) to extract subgroups that are similar with respect to one analyzed aspect and can be reliably separated from the rest of the documents by the second analyzed aspect.
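The first approach can be illustrated with a minimal sketch. It is not the Biblium implementation: the toy corpus, the function names, the choice of k-means as the partitioning method, and the use of a one-sided Fisher exact test on 2x2 contingency tables are all assumptions made for illustration. Documents are clustered by their keywords (first aspect), and each cluster is then tested for over-representation of each author (second aspect). A real analysis would also need a correction for multiple testing, which is omitted here.

```python
from math import comb

# Hypothetical toy corpus: two aspects per document (keywords and authors).
DOCS = [
    {"keywords": {"ai", "governance"}, "authors": {"A"}},
    {"keywords": {"ai", "governance", "ethics"}, "authors": {"A"}},
    {"keywords": {"ai", "ethics"}, "authors": {"A", "B"}},
    {"keywords": {"ai", "regulation"}, "authors": {"A"}},
    {"keywords": {"tax", "fiscal"}, "authors": {"C"}},
    {"keywords": {"tax", "fiscal", "crisis"}, "authors": {"C"}},
    {"keywords": {"fiscal", "crisis"}, "authors": {"C", "B"}},
    {"keywords": {"tax", "crisis"}, "authors": {"C"}},
]

def vectorize(docs, field):
    """Encode one aspect as binary vectors over its sorted vocabulary."""
    vocab = sorted(set().union(*(doc[field] for doc in docs)))
    return [[1.0 if term in doc[field] else 0.0 for term in vocab] for doc in docs]

def sq_dist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

def kmeans(points, seeds, iterations=20):
    """Plain k-means; `seeds` are indices of documents used as initial centers."""
    centers = [points[i][:] for i in seeds]
    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [min(range(len(centers)), key=lambda c: sq_dist(p, centers[c]))
                  for p in points]
        for c in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def fisher_greater(a, b, c, d):
    """One-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    n, row1, col1 = a + b + c + d, a + b, a + c
    tail = sum(comb(col1, x) * comb(n - col1, row1 - x)
               for x in range(a, min(row1, col1) + 1))
    return tail / comb(n, row1)

def discover(docs, seeds, alpha=0.05):
    """Cluster on keywords, then flag (cluster, author) pairs with p <= alpha."""
    labels = kmeans(vectorize(docs, "keywords"), seeds)
    authors = sorted(set().union(*(doc["authors"] for doc in docs)))
    findings = []
    for cluster in sorted(set(labels)):
        for author in authors:
            a = sum(1 for doc, lab in zip(docs, labels)
                    if lab == cluster and author in doc["authors"])
            b = labels.count(cluster) - a                             # in cluster, other authors
            c = sum(1 for doc in docs if author in doc["authors"]) - a  # author, other clusters
            d = len(docs) - a - b - c
            p = fisher_greater(a, b, c, d)
            if p <= alpha:
                findings.append((cluster, author, p))
    return labels, findings

labels, findings = discover(DOCS, seeds=(0, 4))
for cluster, author, p in findings:
    print(f"cluster {cluster}: author {author} over-represented (p = {p:.4f})")
```

On this toy corpus the sketch separates the AI-related documents from the fiscal ones and reports one over-represented author per cluster. The seed documents are fixed only to keep the run deterministic; any partitioning method could be substituted for `kmeans`.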

As part of the project, we will implement basic and advanced bibliometric techniques in a Python package called Biblium. Biblium will be the most comprehensive Python package for bibliometric analysis, as it will integrate all the procedures from the R package Bibliometrix along with more sophisticated methods for analyzing bibliographic data, including data mining methods and subgroup analysis. In addition, we will perform the bibliometric analysis itself and implement several state-of-the-art approaches and visualizations that are currently scattered across different programs rather than available in a single tool.

In the final phase of the project, we will integrate Biblium with the open source data mining software Orange as its add-on Orangebib. This integration will combine bibliometric analysis with data mining methods in user-friendly software that requires no programming skills. By combining Orangebib with existing Orange add-ons (bioinformatics, advanced text mining, geomaps, etc.), Orange users will be able to find new, creative ways to combine different aspects of bibliographic data and make an important contribution to the field of bibliometrics.

We plan to apply data mining and subgroup discovery techniques to several areas, including applications in the natural sciences (medicine, drug repurposing, genetics, etc.) and the social sciences (public administration, online learning, taxation, artificial intelligence, and disruptive technologies in the public sector, etc.).

We intend to publish several papers as results of the project, including software and methodology papers in leading journals of scientometrics and data mining, as well as applications of the developed tools in several journals in the natural and social sciences. We also plan to present Biblium and Orangebib at several national and international conferences in the field of scientometrics. As a final deliverable, we plan to organize a free one-day online workshop where users will learn how to use Orangebib to easily perform advanced bibliometric analyses.

Duration (from/to):

10. 2023 – 30. 9. 2026

Contracting Authority:

Slovenian Research and Innovation Agency

Financing:

The project is financed at 2571 hours per year (price category A) for 3 years.

More about the project:

https://www.project-hercules.si/nika/#contact

Project references

- BABŠEK, Matej, RAVŠELJ, Dejan, UMEK, Lan, ARISTOVNIK, Aleksander. Artificial intelligence adoption in public administration : an overview of top-cited articles and practical applications. AI. 2025, vol. 6, iss. 3, pp. 1-25, illus. ISSN 2673-2688. https://www.mdpi.com/2673-2688/6/3/44, DOI: 10.3390/ai6030044.
- RAVŠELJ, Dejan, UMEK, Lan, TOSUN, Mehmet Serkan, ARISTOVNIK, Aleksander. Mapping fiscal research trajectories through bibliometric analysis : echoes of global crises in Central and Eastern Europe. NISPAcee journal of public administration and policy. 2024, vol. 17, iss. 1, pp. 169-197. ISSN 1338-4309. https://sciendo.com/article/10.2478/nispa-2024-0008, Repozitorij Univerze v Ljubljani – RUL, dCOBISS, DOI: 10.2478/nispa-2024-0008.
- UMEK, Lan, RAVŠELJ, Dejan, ARISTOVNIK, Aleksander. Artificial intelligence and public administration in the sustainable development goals perspective : a bibliometric review and future research agenda. In: IIAS/DARPG India Conference 2025 : 10-14 February 2025, New Delhi, India : ConfTool Conference Administration. New Delhi: IIAS. 2025, pp. 1-6, illus. https://www.conftool.net/iias-darpg-indiaconference2025/index.php?page=login, https://iias-iisa.org/iias-darpg-indiaconference2025/.
- BREZOVAR, Nejc, UMEK, Lan, RAVŠELJ, Dejan. Research trends in artificial intelligence and legislation : implications for the protection of human rights. In: IIAS/DARPG India Conference 2025 : 10-14 February 2025, New Delhi, India : ConfTool Conference Administration. New Delhi: IIAS. 2025, pp. 1-14, illus. https://www.conftool.net/iias-darpg-indiaconference2025/index.php?page=login, https://iias-iisa.org/iias-darpg-indiaconference2025/.
- UMEK, Lan, ARISTOVNIK, Aleksander, RAVŠELJ, Dejan, KOVAČ, Polonca. Shaping reforms through public governance models : a bibliometric perspective. In: Alternative Service Delivery and Sustainable Societal Responsiveness : IASIA 2024 Conference : July 1-5, 2024, Bloemfontein, South Africa. Bloemfontein: International Association of Schools and Institutes of Administration. 2024, pp. 1-14, illus. https://www.conftool.org/iasia-conference-2024/index.php?page=login, https://iias-iisa.org/events/iasia-2024-conference.
- RAVŠELJ, Dejan, UMEK, Lan, FATUR ŠIKIĆ, Tanja, ŠVERKO GRDIĆ, Zvonimira. Digital government transformation and economic aspects of sustainable development : an overview of recent research trends. In: EGPA 2024 Conference, Athens, 3-6 September 2024 : ConfTool Conference Administration. Athens: EGPA. 2024, pp. 1-9, illus. https://www.conftool.org/egpa-conference2024/index.php?page=login, https://iias-iisa.org/egpa-2024-conference/.
- UMEK, Lan, TAKAHIRO, Miura, ARISTOVNIK, Aleksander, RAVŠELJ, Dejan. Collaborative governance and sustainable development : evidence from bibliometric analysis. In: IIAS Mombasa Conference 2024 : 26-29 February 2024, Mombasa, Kenya : ConfTool Conference Administration. Mombasa: IIAS. 2024, pp. 1-8. https://www.conftool.org/iias-ksg-mombasaconference2024/index.php?page=index, https://iias-iisa.org/iias-2024-conference/.
- KOZJEK, Tatjana, ZOREC KLEMENČIČ, Uroška, UMEK, Lan. Volunteer motivation in firefighting organisations : a case of the Slovenian Firefighters Association. Fire. Jun. 2025, vol. 8, iss. 6, [article no.] 220, pp. 1-17, illus. ISSN 2571-6255. https://www.mdpi.com/2571-6255/8/6/220, DOI: 10.3390/fire8060220.
- BRUTHANS, Jan, DUFTSCHMID, Georg, STANIMIROVIĆ, Dalibor, et al. Comparison of electronic prescription systems in the European Union : benchmarking development, use, and future trends. IEEE journal of biomedical and health informatics. [Print ed.]. May 2025, vol. 29, no. 5, pp. 3712-3722, illus. ISSN 2168-2194. https://ieeexplore.ieee.org/document/10891153, DOI: 10.1109/JBHI.2025.3531317.
- ARISTOVNIK, Aleksander, MURKO, Eva, KRISTL, Nina, RAVŠELJ, Dejan. Disruptive technology capabilities in local governments : an empirical study. Information polity. 2025, vol. , no. , pp. 1-20, illus. ISSN 1570-1255. https://journals.sagepub.com/doi/10.1177/15701255251321682, DOI: 10.1177/15701255251321682.
- BREZOVAR, Nejc. Use of artificial intelligence and other digital tools in participatory budget practice – empirical evidence from Slovenia. Lex localis : revija za lokalno samoupravo. [Online ed.]. 2025, vol. 23, no. 3, pp. 194-209, illus. ISSN 1855-363X. https://lex-localis.org/index.php/LexLocalis/article/view/2773, DOI: 10.52152/23.3.194-209(2025).
- POŽAR, Ingrid, BAJROVIĆ, Fajko, UMEK, Lan, ŠURLAN POPOVIĆ, Katarina. Automated assessment of collateral circulation and infarct core : predictors of functional outcomes in acute ischemic stroke following endovascular thrombectomy. Neuroradiology. 2025, vol. , iss. [ahead of print], pp. 1-11. ISSN 0028-3940. https://link.springer.com/article/10.1007/s00234-024-03519-4, DOI: 10.1007/s00234-024-03519-4.
- BREZOVAR, Nejc. The role of artificial intelligence in NGOs : challenges and opportunities for Slovenia’s information society. NISPAcee journal of public administration and policy. Jun. 2025, vol. 18, iss. 1, pp. 11-30, charts, tables. ISSN 1338-4309. https://sciendo.com/article/10.2478/nispa-2025-0002, DOI: 10.2478/nispa-2025-0002.
- RAVŠELJ, Dejan, KERŽIČ, Damijana, TOMAŽEVIČ, Nina, UMEK, Lan, BREZOVAR, Nejc, ARISTOVNIK, Aleksander, et al. Higher education students’ perceptions of ChatGPT : a global study of early reactions. PloS one. 2025, vol. 20, iss. 2, [article no.] e0315011, pp. 1-53, illus. ISSN 1932-6203. https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0315011, DOI: 10.1371/journal.pone.0315011.
- BABŠEK, Matej, RAVŠELJ, Dejan, UMEK, Lan, ARISTOVNIK, Aleksander. Mapping the adoption of disruptive technologies in public administration : a bibliometric analysis and review of practical applications. SAGE open. 2025, vol. 15, iss. 2, pp. 1-25, illus. ISSN 2158-2440. https://journals.sagepub.com/doi/10.1177/21582440251335516, dCOBISS, DOI: 10.1177/21582440251335516.
- MURKO, Eva, BABŠEK, Matej, ARISTOVNIK, Aleksander. Artificial intelligence and public governance models in socioeconomic welfare : some insights from Slovenia. Administraţie şi management public. 2024, iss. 43, pp. 41-60, illus. ISSN 2559-6489. https://www.ramp.ase.ro/vol43/43-03.pdf, DOI: 10.24818/amp/2024.43-03.
- ARISTOVNIK, Aleksander, RAVŠELJ, Dejan, MURKO, Eva. Decoding the digital landscape : an empirically validated model for assessing digitalisation across public administration levels. Administrative sciences. 2024, vol. 14, iss. 3, pp. 1-22, illus. ISSN 2076-3387. https://www.mdpi.com/2076-3387/14/3/41, dCOBISS, DOI: 10.3390/admsci14030041.
