Samia Touileb

Position

Associate Professor, Natural Language Processing

Short info

I am an Associate Professor in Natural Language Processing (NLP). I am also the co-leader of the NLP work package at the Research Centre for Responsible Media Technology & Innovation, MediaFutures. My main research interests are alignment, bias, and fairness in NLP.
Research

I am an Associate Professor in Natural Language Processing (NLP). I am also the co-leader of the NLP work package at the Research Centre for Responsible Media Technology & Innovation. Prior to this I was a researcher in MediaFutures on Norwegian Language Technologies, and a Postdoc at the Language Technology Group (LTG), Department of Informatics, at the University of Oslo. I have a PhD in NLP from the University of Bergen, and have worked on both NLP research and its applications.

My main research interests are bias and fairness in NLP, alignment, information extraction, summarization, and applications of NLP and machine learning methods to tasks within social science research. I work mainly on under- and mid-resourced languages such as Norwegian.

Outreach
  • Keynote at the AI Regulation and Governance: A Cross-Jurisdictional Approach conference, June 2024. Title of my keynote: Questioning the Machine: Decisions, Bias, and Fairness in AI.
  • Invited speaker at the Western Norway Film Fund Industry day (Vestnorske bransjedagar), May 2024. Title of my talk: Etiske problemstillingar rundt bruken av AI.
  • Invited talk at KPMG, Bergen, May 2024. Title of the session: Kurs: Etikk, sikkerhet og fremtiden med AI. Title of my talk: Etikk og AI.
  • Invited speaker at the Nordic Media Days (Nordiske mediedager), May 2024. Title of my talk: ChatGPT, Copilot og Bard – hvordan fungerer store språkmodeller?.
  • Invited speaker at the Bergen Chamber of Commerce and Industry seminar, February 2024. Title of the seminar: Realiser AI potensialet på arbeidsplassen - Kan Copilot utløse mulighetene?. I gave a presentation entitled Begrensninger og etiske betraktinger rundt språkmodeller, and I was a member of a panel discussion. The seminars are aimed at members from all companies and industries in the Bergen region.
  • Invited at Metis high school, Bergen, February 2024. Invited to give a talk at their Internet safety day. I gave a presentation about AI and large language models, their capabilities and weaknesses.
  • Speaker at Lærernes dag 2024, January 2024. Invited to give a talk at this year's lecturers' day at the University of Bergen. This annual event brings together teachers from the Bergen area to promote research. The title of my presentation was: Læring i språkmodellenes tid.
  • Invited speaker at the Norwegian Association for the Protection of Industrial Property, November 2023. Title of the seminar: Ettermiddagsseminar om kunstig intelligens og immaterialrett. The seminar was organised by Opphavsrettsforeningen (Norsk Forening for Opphavsrett) and the Norwegian Association for the Protection of Industrial Property. Title of my talk: Hva er kunstig intelligens i stand til å skape og hva kan vi vente oss i nær fremtid?.
  • Interviewed for an article on forskning.no, October 2023. Interviewed by Øystein Rygg Haanæs for an article published on the research platform forskning.no. Title: Språkteknologi på villspor: Drømmer kvinner om å bli voldtatt?.
  • Invited talk at Sampol-Konferansen, October 2023. The Comparative Politics Conference, University of Bergen. Title: Big Science: Gullgruve eller fallgruve?.
  • Invited talk at The Norwegian Academy of Science and Letters, September 2023. Det Norske Videnskaps-Akademi. Title: Store språkmodeller: muligheter og utfordringer.
  • Interviewed for TV 2’s impact on society report, October 2023. Interviewed by Øystein Rygg Haanæs for TV 2’s impact on society report, which celebrates their 30 years as an independent news provider in Norway. Title: Kunstig intelligens krever åpnehet og integritet.
  • Op-ed piece in Morgenbladet, July 2023. Title: Chat GPT egner seg dårlig til eksamenssensuren. Authors: Pierre Lison (senior researcher at Norwegian ) and Samia Touileb.
  • Invited talk at a European Broadcasting Union (EBU) webinar, June 2023. Theme of the webinar: LLM Benchmarking Strategies. Title of my talk: Benchmarking the societal and ethical implications of large language models.
  • Invited talk at the Norwegian School of Economics (NHH), June 2023. Title: Demystifying ChatGPT and language models.
  • Invited talk at the Department of Mathematics, University of Bergen, May 2023. Title: Large Language models: What are they, and what are their ethical implications?.
  • Invited speaker at Future Week, June 2023. Title of the session: Innovation in the newsroom.
  • Invited panellist at Eilerts salong, organised by the University of Agder, May 2023. Title: Blir vi overflødige? En samtale om kunstig intelligens og utdanning.
  • Invited talk at norske dataforeningen, May 2023. Title: Sosiale og etiske utfordringer med språkmodeller som ChatGPT.
  • Invited to a podcast episode – Nordepodden, May 2023. Title of the episode: Når kunstig intelligens får ordet i sin makt.
  • Op-ed piece in Medier 24, April 2023. Title: KI-dyret må mates med varsomhet. Authors: Samia Touileb and Per Christian Magnus (Leader of the center of investigative journalism SUJO). A short English summary is also available.
  • Invited to a podcast episode – Abels tårn, March 2023. The episode was a small section of a forthcoming AI series. I talked about NLP and pre-trained language models, and discussed some of the ethical and societal issues.
  • Invited talk at the University of Stavanger, March 2023. Title: Når kunstig intelligens inntar redaksjonen, organised by Media City Bergen.
  • Invited talk at the Department of Archaeology, History, Cultural Studies and Religion, UiB, March 2023. Title: ChatGPT: teknologien, datasettet, og det vi (ikke) vet.
  • Invited speaker at its learning webinar, March 2023. Title of the webinar: ChatGPT and AI in education.
  • Invited to a podcast episode – lektorlomsdalen podcast, February 2023. Title of the episode: ChatGPT og etiske perspektiver.
  • Invited speaker at Vestland Fylkeskommune seminar about teaching, February 2023. Title: ChatGPT: teknologien, datasettet, og det vi (ikke) vet.
  • Invited speaker at the lecturers’ conference 2023 (University of Bergen), February 2023. Invited talk about large language models, their potential and drawbacks for use in education. Solstrand, Bergen.
  • Invited speaker and panellist at NORA ground-breaking seminar series, February 2023. Title of the talk: The Societal and Ethical Implications of Language Models. Title of the panel discussion: The Ethics of Large Language Models.
  • Invited speaker and panellist at UiB AI seminar series, February 2023. Title of the seminar: ChatGPT – trussel eller mulighet i forskning og utdanning?.
  • Invited talk at Wolftech, February 2023. Title: Measuring harmful and toxic representations in Scandinavian language models.
  • Invited speaker at ForskningsdageneUNG 2022, September 2022. I was invited to give an inspirational talk about my research to high school students during the research days, at the University of Bergen. I gave a presentation about NLP, and tried to motivate future students to seek a degree within informatics, NLP, or the broad field of AI.
  • Invited guest to a podcast episode, April 2022. I was invited as a guest to the podcast UiB Popviten, the University of Bergen’s popular science podcast. The theme of the episode was the problematic aspects of black box machine learning models and how they can contain various types of biases and cause harmful outcomes when deployed. The episode is in Norwegian and is available online.
  • Speaker at UiB AI seminar series, April 2022. Title: But, why? - make AI answer!. I presented and discussed interpretability of neural models.
  • Panel member (invited), March 2022. I was invited to be a member of a panel discussion on AI: friend, foe, or fad, at the annual Booster conference in Bergen. Booster is a software conference organized for developers, project managers, architects, UX professionals, testers, and security professionals. The panel was led by Professor Marija Slavkovik, University of Bergen. The other panelists were Kevin Baum (University of Saarland) and Martin Gundersen (NRK – Norwegian Broadcasting Corporation).
Teaching
  • INFO371 – Research Topics in Networks and Text Analysis (Spring 2022, Spring 2024, Spring 2025). Master’s course. Focus on Natural Language Processing, machine learning, and deep learning. Department of Information Science and Media Studies. University of Bergen.
  • DIGI117 – Natural Language Processing (Autumn 2023, Spring 2024, Autumn 2024, Spring 2025). New course I have developed for the University of Bergen’s Digital understanding, knowledge and competence (DIGI) course package. Department of Information Science and Media Studies. University of Bergen.
  • DIGI114 – Basic introduction to artificial intelligence (Spring 2025). Department of Information Science and Media Studies. University of Bergen.
  • Invited guest lecturer at the Norwegian School of Economics (NHH – Autumn 2023). Lecture for Executive MBA students about large language models, how they are currently used, how they can be used, and their ethical and societal impacts. The Norwegian School of Economics, NHH, Bergen.
  • AIKI100 – Examen facultatum - Introduction to Artificial Intelligence (Autumns 2021, 2022, 2023, 2024). Bachelor course. I gave an introductory lecture about Natural Language Processing. Department of Information Science and Media Studies. University of Bergen.
  • Invited guest lecturer, MA8701 – Advanced statistical methods in inference and learning (15.03.2021). PhD course in statistics. Theme of the lecture: Analysing text with neural networks. Department of Mathematical Sciences. NTNU (Norwegian University of Science and Technology).
  • IN1140 – Introduction to Language Technology (Autumns 2017, 2018, 2019, and 2020). Bachelor course. Department of Informatics. University of Oslo.
  • Guest lecturer, IN2110 – Methods in Language Technology (19.02.2019). Bachelor course. Theme of the lecture: lexical semantics and word vectors. Department of Informatics. University of Oslo.
  • INFO134 – Client programming (CSS3, HTML5, and JavaScript) (Spring 2017). Bachelor course. Department of Information Science and Media Studies. University of Bergen.
  • INFO103 – Information and Knowledge (Spring 2015). Bachelor course. Department of Information Science and Media Studies. University of Bergen.
     
Publications

See a complete overview of publications in Cristin.

Projects

OPINION COST action: https://www.cost.eu/actions/CA21129/

MediaFutures: https://mediafutures.no/2021/01/20/postdoc-samia-touileb/

NorDial: https://github.com/jerbarnes/nordial