The KIB podcast
The KIB podcast is a podcast from the University Library where we tackle topics related to the scientific publishing landscape and research infrastructure.
Episode 1: Your questions – our answers: Library people talk about AI
In this episode of the KIB podcast Alvin, bibliometric analyst, Lovisa, librarian, and Anna, writing instructor, discuss questions from colleagues and students at KI on LLMs (large language models).
- Does AI think?
- Does AI know everything?
- Can I trust "my friend Bob" the AI?
Related information
Transcript of the KIB podcast episode 1
This is a transcript of the KIB podcast episode 1. The transcript was generated by AI, so it may contain minor mistakes.
ALVIN
Welcome to the KIB podcast, the podcast of the Karolinska Institutet Library. I'm Alvin Gavel, Bibliometric Analyst.
LOVISA
I'm Lovisa Liljegren, Librarian.
ANNA
And I'm Anna Borgström, Writing Instructor.
ALVIN
This episode is going to be about large language models. Most of it is going to be a discussion where we try to answer questions that the library has received repeatedly from people at Karolinska Institutet. So we have a little stack of questions here that people keep sending to us every once in a while.
LOVISA
Yes, but I think that maybe we can start with this: I found a tool called NotebookLM from Google that creates something that sounds like a podcast. You can give it papers, you can give it web pages and so on. So I gave it our staff pages: the staff page on you, Anna, the one on you, Alvin, and the one on me. And I think we can listen to the output and then we can talk a little bit about that.
NB1
You mentioned we have profiles of three KIB staff members.
NB2
Yes, we do. Let's start with Alvin Gavel, a Bibliometric Analyst with a background that's, well, frankly, out of this world.
NB1
Oh, tell me more.
NB2
His research background is in astronomy, astrophysics, and cosmology pedagogy.
NB1
Wow.
NB2
It seems his interests lie among the stars. He's published work on topics like the history of astronomy education and even the use of virtual reality in teaching astronomy.
NB1
That's amazing. I'd love to pick his brain about how his experience in those fields informs his work with bibliometrics at KIB. Right. It must give him a unique perspective on the evaluation of knowledge and the impact of scientific discovery.
NB2
I would imagine. So now let's move on to Lovisa Liljegren, a librarian.
NB1
Okay.
NB2
Who plays a key role in teaching information searching and reference management. Skills that are absolutely essential for any researcher, especially in the digital age.
NB1
Especially with the sheer volume of research being published these days.
NB2
Right. It must be like trying to drink from a firehose.
NB1
It really is.
NB2
What's particularly interesting about Lovisa is her special interest in AI and information literacy.
NB1
That's incredibly timely. With AI tools like ChatGPT becoming increasingly prevalent, it's more important than ever for students and researchers to understand how to use them effectively and responsibly.
NB2
Right. It's not just about using the tools themselves. It's about critically evaluating the information they generate. And understanding their limitations. Lovisa is clearly at the forefront of this emerging field.
NB1
And she's based in Flemingsberg, but also available in Solna and via Zoom.
NB2
Yeah. Talk about meeting people where they are. And the last person we have a profile on is Anna Borgström, a writing instructor. Anna is clearly passionate about helping students develop their academic writing skills. She teaches group workshops, offers individual consultations, and has even presented at conferences on the topic. But what really struck me was her recent focus on teaching students how to use sources effectively, avoid plagiarism, and navigate the ethical considerations of using AI in writing.
NB1
That's really fascinating. It speaks to KIB's commitment to not just teaching practical skills.
LOVISA
Okay. So this was Google's NotebookLM.
ANNA
Wow. That's interesting. I didn't know all this about you, Alvin.
ALVIN
Well, this is where it gets weird because that was sort of half true and half insane lies. Because I have worked in astronomy and teaching. I haven't really mixed those. I don't recall teaching the history of astronomy education ever. I also haven't done anything with VR in astronomy teaching. I've worn a VR headset for a little while one time. I hated it because I got seasick. That's kind of my entire contact with the VR field. So what we had there is a pretty clear case of what people tend to call hallucinations in generative AI. I don't really like that term because it's not actually, you know, qualitatively different from what the model does at any other time. And I know you, Anna, also don't like it for several reasons.
ANNA
I don't like it because I think it sounds as though we think these machines have consciousness and that they're doing something or that they have a purpose of doing something. And I think that gives us a weird sense about them. Alvin, would you like to explain a little bit more about large language models?
ALVIN
I mean, you can get really far just by understanding that they are literally what the name says. A large language model is a large model of language. They're purely statistical models of what human language tends to look like. And with the particular architecture that's behind, for example, ChatGPT, they essentially try to predict, given a string of words, what the next word should be. I'm simplifying a little bit here. Some of you listening might be yelling that, no, no, no, you don't mean words, you mean tokens, but whatever. This gives you the basic idea of what it does. And it so happens that if you have a really, really good statistical model of language, you have a model that will output language that happens to be true statements most of the time. It's more likely to output "the sky is blue" than "the sky is green", or "fire is hot" rather than "fire is cold". That's kind of as far as it goes. It doesn't know what the color blue is, or what hot is. It only knows that these words tend to go together in a particular way.
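To illustrate the next-word idea Alvin describes, here is a minimal sketch of a purely statistical language model. It is a toy bigram model over a made-up corpus, not how ChatGPT actually works, but it shows the principle of predicting the most likely next word from word statistics alone.

```python
from collections import Counter, defaultdict

# Made-up mini "corpus"; a real model is trained on vastly more text.
corpus = ("the sky is blue . the sky is blue . the sky is clear . "
          "fire is hot . ice is cold .")
words = corpus.split()

# Count how often each word follows each other word (a bigram model).
following = defaultdict(Counter)
for current_word, next_word in zip(words, words[1:]):
    following[current_word][next_word] += 1

def predict_next(word: str) -> str:
    """Return the statistically most likely word to follow `word`."""
    counts = following[word]
    return counts.most_common(1)[0][0] if counts else "."

print(predict_next("sky"))  # 'is'   - it has learned word order, not meaning
print(predict_next("is"))   # 'blue' - simply the most frequent continuation here
```

The model outputs "blue" after "is" only because that pairing is most frequent in its data; it has no notion of what blue means.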
ANNA
Yeah, but then I mean our presentations, they were a little bit funny, right? And super positive of course, and that's what you often get from these tools. But ours were a little bit more true, so to speak.
LOVISA
Yeah, and we need to keep in mind what data we gave it: on our staff pages, Anna, we have written more about ourselves than Alvin has. On Alvin's staff page, you can find articles that he has published, but he hasn't written a text about himself, which we have done. So of course, there is more data from us and less room for it to make things up.
ALVIN
Yeah, which meant that it, well, I would say interpolated; I prefer that to saying hallucinated. Given what had been put into the context, that I had been working with astronomy and teaching, sentences like "he has worked on the history of astronomy education" and so on are plausible sentences. They don't surprise you unless you're me and go: what? When did I do that? Then, obviously, you'll be a bit confused. And that tends to be the case most of the time when you'd say that a language model hallucinates: it says something that sounds reasonable, and sounding reasonable is kind of all that's actually encoded in the statistical model.
ANNA
Yeah, and I tend to think about these models as a little bit like the kids' pastime of connecting dots. You know, when you have dots with numbers and you're supposed to connect every dot and you get a picture. And that picture often looks like something. It doesn't look exactly like what the animal actually looks like, but it looks okay. But if those dots are a little bit more sparse, you may miss something like a tail or an ear. And I think, is it true perhaps that since we had less information on you, Alvin, to give to the large language model, it kind of tried to connect some dots in a statistically likely manner, but then made some connections that were in fact not the same as the true situation?
ALVIN
Yeah, I suppose. Yeah, you could look at it that way. And of course, always keeping in mind that it never ever tries to say true things. It only tries to make, you know, statistically plausible sentences.
ANNA
Yes, exactly. And it doesn't have a mind of its own. It just does what it's programmed to do.
LOVISA
And I think maybe we can start with the first question. And that is about getting feedback from a large language model. So I can read the question maybe.
ANNA
Yeah, go ahead.
LOVISA
So I wrote an introduction to my text and I asked an LLM to improve the text and it got so much better. Can I use the LLM's version instead?
ANNA
Well, I'm happy you found the large language model useful. But I would say you have to be super cautious about getting feedback from these tools. One reason is that you will often get suggestions, or rather rewrites, of the text. And these rewrites are not based on rules about language or facts about the content; they're just based on statistical likelihood. So that means, for example, if you've written "treatment effectiveness" and you get a rewritten text with "treatment efficacy", these two are very much different in a medical context. However, these large language tools have been trained more on everyday language, and there the difference between effectiveness and efficacy is not as pronounced as it is in a medical context. So, well, you can get feedback, and people often feel they get a nicer flow. But you may also get a text that lacks precision, which is of course not what you want. So you have to check everything super carefully.
ANNA
And here I would say: don't fall asleep at the wheel. Meaning that if you just have one paragraph, you have to be super attentive and check everything so nothing changes. But, you know, if you think, oh, this is good, and you put in a bigger chunk of text, it's so easy to lose track of what you're doing and just think everything sounds good. So, well, you can use it, but you have to be super cautious. And I would say that some of the things you can use it for are brainstorming, like getting ideas, or getting some feedback. But it may be better to ask for feedback than to get a rewrite.
LOVISA
Yeah, and of course to be able to evaluate this, you need to have some kind of background knowledge. You need to know that the sentence actually gets better and improved. You need to be able to see that yourself and be confident with that.
ALVIN
Yeah, I'd also agree with all of that. But I'd also add, this isn't actually unique to language models. I would say exactly the same thing with your human co-authors if you're writing an article, for example. Because you'll discover they'll also do weird little things like replace the word method with methodology because they think it means the same thing and so on. That sort of thing happens when you let people change texts in general.
ANNA
Yeah, and I would like to say that there are some tools that you can use, for example the main function of Grammarly, or InstaText. Because if you use those tools and not large language models, instead of getting a rewrite that's based on statistical likelihood, you just get suggestions. And those suggestions are based on grammatical rules, or rules about concision, precision and level of style, for example. So there's a pretty big difference there. You can use them, but I would not just go for the ChatGPT version because it looks better on the surface. Can I use large language models to summarize texts? Because we all know that people are stressed out and have to read so many articles, so a lot of people seem to have gotten into the habit of using these to summarize.
LOVISA
Yes, and this is something we hear a lot about, that people use these tools to summarize text. So, can you?
ALVIN
I would strongly discourage doing that. Large language models, they can shorten texts but they kind of can't actually summarize them. And that might actually be hard to notice unless you read very closely what comes out of it. Because the problem is, if you give ChatGPT or whatever a long text and you say, summarize this for me, it will say, yes, of course. And then you will get something back that is short, contains many keywords from the longer text and that stylistically looks like a summary. Because it can do all of those things. It can recognize that these words occur here, so they should probably be there. It recognizes what a summary tends to look like. It doesn't actually understand what the original text is about, which means it doesn't understand what's significant. So it doesn't know what part to pick out and what parts it can throw away. So it's kind of just going to throw away things a bit at random and give you a shorter text, which might reflect the original text and it might not.
LOVISA
Yeah, and we've seen this many times: the summary is sometimes correct, sometimes it makes things up. It includes things that are not true, that are not written in the original. But even if the summary or the shortened text is correct, we've seen that it misses the core of the original text. The core is completely removed.
ANNA
Yeah, I've talked to so many people who have actually used these to summarize texts, and they have been kind of happy about what it does. But I would say that what they do is ask for a summary first, because they think it's kind of daunting to read a long research article. So they read the summary first, and then, after they know a little bit about the topic, they read the entire article and feel a little bit more prepared. But I guess it's really important that you understand that you're not really getting a summary. Maybe you're just getting some kind of introduction to the field, and then you have to read the article while asking yourself whether it was a correct summary.
ALVIN
Yeah, I'd also add the article probably has an abstract in it already. You can just read that.
LOVISA
Yeah, exactly. What is the point of these shortened texts when you have the abstract already?
ANNA
Another thing, quite related to the previous question, is about large language models that translate, like DeepL.
Is it okay to use DeepL? People think it works so well.
ALVIN
I would say sure, go ahead. I do think those are genuinely useful, and I'm usually kind of negative otherwise about different types of language models. That said, the thing you have to keep in the back of your head is an even stronger version of what Lovisa said earlier: how do you tell that a translation is any good? If I take a text in Finnish and translate it to Swedish, well, I can look at the final product and see that, hey, this is probably a well-written text. But I don't know Finnish. I can't tell if it's actually a correct translation or if something terrible happened along the way.
LOVISA
But this is of course something where you need to have a level of trust. If you have a translator, a person who translates a text, you need to have a level of trust in that person as well.
ALVIN
Yeah, sure. But if you ask me to translate a text, you can get mad at me afterwards and possibly fire me if I do a bad job.
But what do you do when a language model screws up?
ANNA
Yeah, because I think that could be the consequence, right? That if we would write an email to someone who doesn't master Swedish or English and we just ask DeepL or a similar tool to translate the text, we could end up having major misunderstandings or even someone could become upset. Because I think also a professional translator would have some kind of understanding of the cultural context as well. Which of course a large language model probably doesn't.
ALVIN
It might to some extent because to some extent, culture actually is encoded in the statistical properties of language. But you don't know how much it understands.
ANNA
So I guess it's a little bit dangerous.
LOVISA
But still, we all agree here that you need to have some kind of background knowledge to be able to evaluate the outputs that you get.
ANNA
Yeah, and I think here too it's kind of easy to fall asleep at the wheel, because the output you see looks so great, so you tend to trust these tools more and more. But don't do that. You have to really know if the translation is okay or not.
Lovisa, there are some questions too about using AI to find articles, research articles. What do you think about that?
LOVISA
Yeah, and when ChatGPT came, that was something that libraries around the world had to deal with: library patrons came with questions about articles that they had received from ChatGPT, but that they couldn't find anywhere. They had tried Google Scholar, they had tried different databases, but they couldn't find these articles. And now I think most people do know that large language models can, so to speak, hallucinate articles, or make up articles that do not exist, even though ChatGPT has become better now. But we need to understand that it is a large language model; it's not a search engine, it's not a database.
ANNA
Yeah, I've found that one of the most useful ways of explaining large language models is to talk about these fake references that were generated. It is a fake reference in the sense that it doesn't work, that it doesn't exist. But the thing is that these have been generated based on statistical likelihood. So basically, if you ask about an article on a topic, you will get a title that ChatGPT or similar would deem likely that someone would have written: if they wrote about that topic, they could have written an article with this title. And if someone wrote an article with this title, it's likely that it would have been written by this author. They didn't write it, but it's likely. And it goes on like that.
ALVIN
I'd also add that you can integrate a database or a search engine into a large language model. There's a thing called RAG, retrieval-augmented generation, for example, where essentially, if you ask the model for, I don't know, an explanation of Newton's laws or something, it will check a database, find a text about Newton's laws, and drop that into the context window of the model. And what comes out will probably be true; it's more likely that what you get is something true. But in the end, it doesn't remove the problem. Even when you do that sort of thing, it's still just fundamentally trying to figure out which words should follow which other words. It's just that, hidden from you, it has some extra words that actually are true. And hopefully it's going to sort of repeat the substance of those words. But still, it's not a physicist; it doesn't actually understand Newton's laws.
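As a rough sketch of the RAG pattern Alvin mentions: retrieve a relevant text, drop it into the prompt, and let the model generate from that. The document store, the word-overlap retrieval and the `call_llm` placeholder below are hypothetical stand-ins, not any particular tool's API.

```python
# Rough sketch of retrieval-augmented generation (RAG). The documents,
# the word-overlap retrieval and call_llm() are hypothetical stand-ins.

documents = {
    "newton": "Newton's laws: 1) inertia, 2) F = ma, 3) action and reaction.",
    "kepler": "Kepler's laws describe how planets orbit the Sun.",
}

def retrieve(question: str) -> str:
    """Toy retrieval: pick the document sharing the most words with the question."""
    question_words = set(question.lower().split())
    return max(documents.values(),
               key=lambda text: len(question_words & set(text.lower().split())))

def call_llm(prompt: str) -> str:
    """Hypothetical placeholder for a real language model call."""
    return f"(generated answer conditioned on: {prompt!r})"

def answer(question: str) -> str:
    # The retrieved text is dropped into the model's context window, so the
    # generated words are more likely, but not guaranteed, to be true.
    context = retrieve(question)
    prompt = f"Context: {context}\n\nQuestion: {question}\nAnswer:"
    return call_llm(prompt)

print(answer("Can you explain Newton's laws?"))
```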
LOVISA
Yeah. And these tools that you are talking about are sometimes called AI search tools or large language model search tools. There are several of those out there, like Elicit and Perplexity, and there are new tools coming all the time that work in the same way. What usually happens is that you write a prompt into these tools, and then they work in different ways. Some of them translate the prompt that you write into a search query using so-called lexical search, the same traditional way as in a traditional database. So it will pick up keywords and then search using those keywords; the search is quite similar to how you do it in a database. But there are also tools that use semantic search, which means they can understand the words in another way: they can understand the context of the words and so on. And what you usually get is the top five articles in these tools. When you try different tools, you will realize that the top five articles are not the same in the different tools.
And it's not really transparent why you get those top five articles and why you don't get some other articles. So this is something we usually say you should be aware of when you're using these search tools: they are not transparent, and they are not reproducible either. If you use the same prompt a week from now, you will probably get other articles. So you need to be careful when using these tools. You can of course use them as a starting point or as a complementary search, but you should probably not stick to one of these tools. You should probably try several of them, or also do a traditional search in a traditional database. When you're searching in a traditional database, what you get is a list of articles. You don't get a summary of the different articles; you get a list of all the articles that match the search query that you put in. And that is not the case when you're using these AI search tools.
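A small, contrived illustration of the difference Lovisa describes between lexical (keyword) search and semantic search. The paper titles and the tiny hand-made vectors are invented for the example; a real semantic search tool would use learned embeddings, not a lookup table.

```python
import math

# Invented example papers and a toy query.
papers = [
    "Aspirin after myocardial infarction",
    "Aspirin and heart attack outcomes",
    "Sleep quality in night-shift nurses",
]
query = "drugs for heart attack patients"

def lexical_score(query: str, text: str) -> int:
    """Traditional keyword matching: count shared words."""
    return len(set(query.lower().split()) & set(text.lower().split()))

# Hand-made 2-dimensional "meaning" vectors [cardiology-ness, sleep-ness];
# these stand in for the learned embeddings a real tool would compute.
toy_embedding = {
    "aspirin after myocardial infarction": [0.9, 0.1],
    "aspirin and heart attack outcomes": [0.9, 0.1],
    "sleep quality in night-shift nurses": [0.1, 0.9],
    "drugs for heart attack patients": [0.9, 0.1],
}

def cosine(a, b):
    """Cosine similarity between two 2-d vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

for paper in papers:
    keyword = lexical_score(query, paper)
    semantic = cosine(toy_embedding[query], toy_embedding[paper.lower()])
    print(f"{paper}: keyword overlap = {keyword}, semantic similarity = {semantic:.2f}")
# Keyword search misses the 'myocardial infarction' paper completely,
# while the semantic score ranks it as relevant as the 'heart attack' one.
```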
ANNA
So I'm curious. If I find a large language model that will give me an answer to the question I ask it, and I do get a lot of references in the text, and if I then check up all those references and they seem okay, can I do that?
LOVISA
There is also a risk of confirmation bias, of course. For example, there are tools that you can use when you write a text, where you can just take that text and try to find a reference that will fit it. But of course, then you don't know what's out there. You don't know what you're missing. So that's something that you should also be aware of.
ANNA
It looks kind of like these are more reliable, but they're also a little bit more sneaky perhaps. In that sense, yeah.
ALVIN
Also, there's the simple problem that it might give you an answer to what you're asking, with a bunch of references attached to that answer, and if you take the time to actually read them, it turns out that no, they say no such thing whatsoever. That's quite common. And of course, that's not just limited to large language models. If you've ever done a literature search, where you read articles, then read the articles that the first article referred to, and see how well each particular article actually managed to summarize what the other one was about, you notice that human beings are also pretty bad at this.
LOVISA
But something I also think you should be aware of when you're using these AI search tools is to know where you're searching, because some of these tools are integrated with, for example, Semantic Scholar, which is a so-called mega database: you search in a very large database. You can compare that to, for example, PubMed, which indexes specific journals that have been quality assessed before they can be included in PubMed. And that, of course, is also different from, for example, Google Scholar, which is an academic search engine that will just find academic articles, or articles that look like academic articles. For example, there was recently an article published where they found lots of GPT-fabricated papers in Google Scholar. So these articles were just made up. The documents did exist, but they were written by a large language model.
ANNA
So the research behind it didn't exist.
LOVISA
Yes, exactly.
ALVIN
And I should say that's such a big topic in itself. I don't know, maybe we'll just make an episode about that one.
ANNA
There's another question linked to this. I found a blog post that's about my area of research and I was so intrigued. It mentions the article that produced the science but it doesn't give a proper reference and I haven't been able to find the reference. Could you help me with that?
LOVISA
Yes, and this was a question that we did get at the library some time ago. We tried to find the article behind the blog post. We did some searches in Google Scholar, we did some searches in different databases, but we couldn't find the article. And then we did eventually find, very hidden, that this web page said it was generated by AI, or powered by AI. So the blog post was probably completely made up, but of course "powered by AI" can mean lots of different things. We don't really know what that means. Does it mean that a large language model has just made this up and that the article does not exist, or is there some kind of RAG in this, so that the article does exist in some sense? Most likely the article does not exist. But it was interesting, because this also shows how hard it can be to actually evaluate things generated by AI. Even as a researcher, it can be hard to know that this is completely made up, because it looks so good, it looks so true, it looks like this could have been an article that was actually written.
ANNA
That is sneaky.
ALVIN
Yeah and unfortunately I think that's going to be more and more common because well this is an easy way to make money. Just take stuff from a language model and publish it somewhere.
LOVISA
Yeah exactly.
ANNA
And then those texts will be used to train new language models. That's a scary thought I think.
LOVISA
It is scary.
ALVIN
Yeah, sometimes people here use the phrase Habsburg AI, because the problem is that AIs kind of have to be trained on actual human-generated texts. If you feed AI with AI output, you get a form of inbreeding, in a manner of speaking, so eventually they just start producing nonsense. And this is a huge problem for the AI companies now, more or less irrespective of what we're talking about. Because if you do a Google picture search, you're also going to get a bunch of stuff that doesn't exist because it's generated with generative AI. So those models are also going to, well, they'll be harder to train. It will be harder to find actual good data to train them with.
ANNA
Well data is the new gold. Is that what you normally say Lovisa?
LOVISA
Yes. So you should be careful what you put out there.
ANNA
Yeah. I was just going to say that we have a question about that too. Can I use generative AI? Is it safe or can my data be found and used by other people? And what about sensitive information?
LOVISA
We usually say that you should be careful about what data you're putting in there. You can also be aware that there are some settings. For example in ChatGPT there is a setting that says that you don't want the model to use the input that you give to train new models. But of course you should always be careful and maybe not trust this.
ALVIN
Here's where I would introduce a really good rule of thumb called the My Friend Bob rule. Nobody in Sweden is named Bob, so I guess we'd call it "min kompis Leif-regeln" (the My Friend Leif rule) or something like that. The basic idea is that if you want a quick way of figuring out whether it's a really, really bad idea to do something using an AI, just, in your head, try replacing "AI" with "my friend Bob". If I say, "I had some trouble getting started writing this, so I asked my friend Bob for some ideas, and he was pretty helpful; I managed to get started", that's a reasonable thing to say. If I say, "I asked my friend Bob to diagnose this patient", that's probably professional malpractice and may literally be a crime. So don't do it.
ANNA
Yeah, so you're also saying that secrets you wouldn't tell your friend Bob, you shouldn't share with an AI.
LOVISA
Yeah because you never know what he's going to do with it.
ANNA
No, that's correct. So we're kind of approaching another question: can I use generative AI?
LOVISA
Yes, and that depends on how you're using it, of course. But when it comes to academia, there are many journals now that have different kinds of policies on how you should disclose the use of AI.
ANNA
Thank god. You remember in the beginning right where we would see ChatGPT as a co-author?
LOVISA
Yeah.
ANNA
So those guidelines, they are quite different still, aren't they?
LOVISA
They are. And some journals want you to write in the method section how you used generative AI. And some want you to have an acknowledgment. And some want you to have an AI disclosure. So it's very different. We haven't really figured it out.
ANNA
No, but I think we're getting there. We don't have these crazy things any more, like having ChatGPT as a co-author or someone thanking the developers at OpenAI in their acknowledgment section. So we're getting somewhere here. And if you look at these different policies, there are at least some things that they all have in common. Transparency: you have to be very clear about how you use these tools. And accountability: you're always responsible for everything in the text. And then they all seem to be a little bit restrictive on how much you can use generative AI, because of course they don't want to publish articles that have just been generated by generative AI tools. And I think here too, now we've talked about journals, but if you're a student or a doctoral student, you should check with your teachers. Not everyone may have figured everything out yet, but we should try to be as transparent as possible and as curious as possible about this. And always ask.
And I would say for students I always recommend having some kind of AI reflection. So that it isn't just a statement of how you used AI but that there's always this element of reflection. How did you use AI? How did it work? How did it not work? What did you learn? Because you have to think about that all the time. Can you really say that you're fulfilling the criteria for your degree if you use generative AI in this way? And I think there too my friend Bob can be very useful. I mean you can ask your friend Bob to test out some ideas. But you can't ask your friend Bob to write and finish your assignment for you. Of course your teacher wouldn't be happy with that. But I do think it's very nice to have this reflection and this transparency there too. Because I also think that if we're super transparent about our use as students, we should not risk being accused of cheating. Because cheating requires some intent. So if you're transparent about your use, yeah you shouldn't be accused of cheating.
So I think that creates a safe space, which you can then reinforce as a teacher by encouraging open dialogue.
ALVIN
You kind of touched on a slightly larger problem there when you mentioned that you're responsible for what's in your articles. This was a problem before large language models entered the picture. For a very long time, in many fields, there has been a kind of custom of gift authorship: somehow I owe you for something, so I add you as a co-author to my paper, which you might see as a nice thing to do. And then it turns out that my paper is really bad, at which point you might feel that, no, I haven't even read this paper, how can I be responsible for it? But the problem is that if you are an author, you're supposed to be able to stand behind every single word of the article.
ANNA
Yeah, I've seen retractions based on that very scenario that people didn't know that they were co-authors and weren't very happy about it. So my impression is that publishers are a little bit harsher on that now because they don't want drama like that.
ALVIN
So I'm not the one who receives most of these requests, but I have gotten the question: how can I use AI to analyze data? The short answer is, don't use ChatGPT or any other language model to analyze anything whatsoever; it will end horribly. That said, you can do really interesting things with machine learning algorithms. Step one is to find a programmer who actually understands machine learning and explain to that person exactly what it is you want to do. And they can probably, hopefully, do something really useful. Do that before you have your data, not after. And that's true in general, even if you just want to do a basic statistical analysis: talk to a statistician before you get your data. There is even a quote I've had reason to give several times, by R.A. Fisher: "To consult the statistician after an experiment is finished is often merely to ask him to conduct a post-mortem examination. He can perhaps say what the experiment died of." So that's important to keep in mind.
If you're in the situation that you have a bunch of data and you're now thinking about how can I analyze it, how can I analyze this, things have gone wrong. So that's my one question. Do you have any more?
LOVISA
Well, there are many more things to talk about, of course. Now we haven't talked about, for example, open science, open data when it comes to the age of large language models.
ALVIN
Yeah, and I think we should maybe start trying to wrap up. But also, now that we've been trying to answer people's questions, I just want to check: do we have any questions of our own that we genuinely don't know the answer to, but have been thinking about?
LOVISA
There has been a huge hype around large language models lately, ever since ChatGPT. And I'm just thinking, are they here to stay or will we see a kind of decline? What do you think? What are your thoughts on this?
ANNA
To me, I mean, yeah, they don't seem as hyped anymore. And I meet with a lot of students and they don't seem as excited about them. But they do seem to take them for granted and they do use them, often wisely, but they do seem to take them for granted. So I think the hype may be over, but I'm not sure that I think that large language models will disappear. What do you think, Alvin?
ALVIN
Kind of the same. I don't foresee ever saying large language models are like zeppelins, like they're gone now. I don't think that will happen. That said, I think they'll be used a lot more rarely, because right now all of these services are more or less free, and that's not sustainable. OpenAI are burning money all the time; they lose money every time you use ChatGPT. At some point they'll have to stop doing that. They'll have to switch to some kind of subscription-based model. And once you see how expensive it is, I don't think you get so much use out of ChatGPT that you will be willing to pay for it. Large language models will be used less, also because they don't really have that much in the way of clearly defined use cases that you could put into, you know, actual industrial use. So they'll mostly just be things you play around with.
ANNA
And I guess there's also the environmental aspect that powering these machines takes so much electricity and cooling down these servers takes so much water. So it's not really sustainable.
LOVISA
And I also agree with both of you about the future of large language models. And I think when it comes to searching, I think what we will see more of is semantic search. And maybe not as much the large language models search tools, but we'll see.
ANNA
That's interesting.
ALVIN
Right. So I guess my question would be examinations.
For a really long time we have been telling students: write a thing and hand it in by Friday. And there's been a decent chance that the student actually wrote what they handed in. Today, well, now everybody can just use a large language model to write for them. So will we have to change our format for evaluating student performance entirely? Or do you think we can sort of adapt what we're already doing?
ANNA
Well, I think that's a wonderful question. And I do think some teachers were understandably scared when ChatGPT was launched, and they tried to just avoid the issue by giving oral exams or by having exams with no internet access. But I don't think that's sustainable. I do think that we're going to have to teach our students to use generative AI responsibly. And I do also think that students need to be able to write. Another thing: I liked how you started this question, Alvin, because I think for too long we have kind of equated a well-written text with a passing grade or a pass with distinction. And now we can't. And I think it's kind of reasonable that we can't. This kind of forces us to ask questions such as: what is really the purpose of going to university? What do you as a student have to show your teachers that you know? What do your teachers need to understand in order to assess you in a fair way? These are questions that we haven't talked so much about, and I think when we talk about generative AI, this is our chance to return to these super important questions about education.
LOVISA
I think that's a nice way to wrap this up. Because I also think this question about examination is something that, when we started going out and talking to researchers and teachers at KI, was the most common question that we got: this question about examination and how to deal with it.
ANNA
Yeah, and I think it's important to just note that so far, there are no AI detectors that are really reliable. Those that have been in use have been proven to over-identify text as AI-generated when the author is a non-native speaker of English, which of course isn't fair. And also, I don't know what it tells you that someone used generative AI to write the text. Maybe they used it in an okay way. It doesn't really tell you anything. And if we compare this to an anti-plagiarism tool, that gives you the possibility to compare a text to something else, and you will have a clear record of: okay, this is exactly the same text. So you can assess it, whereas there are no traces like that with generative AI. So it's a completely different situation.
ALVIN
Yeah, and I could add that I don't think there ever will be reliable AI detectors. The reason why I think that is a bit technical, so I'm just going to leave it out and say this is my bet about how the future will play out.
ANNA
A cliffhanger.
ALVIN
All right, do you have any, I don't know, shout-outs, like recommendations of other podcasts or papers you've read or blogs or whatever, that you would like to direct listeners to? Hey, here's a good place to learn more about this topic?
LOVISA
When it comes to AI in searching, I would recommend, there are many things to recommend, but if I only choose one, I would recommend Arante's blog about searching. And he writes a lot about AI search tools and the problems with them and how you can use them in a responsible way. You don't have to, Anna. You don't have to.
ANNA
Alvin, I'm sure you have enough tips for all of us.
ALVIN
Yeah, I actually have lots of recommendations. There will be a little Zotero library in the vicinity of wherever this podcast is uploaded, where you can look up some things that we thought were relevant. But off the top of my head, I would recommend a blog named Pivot to AI by Amy Castor and David Gerard. David Gerard is also the author of the book Attack of the 50 Foot Blockchain, which is really good. That blog gives you a lot of the economic and social context for why the AI field currently looks the way it does, because there are many things that kind of don't make sense if you only look at it from a technical point of view. This is going to sound weird, but in general, I recommend reading bitter tech bloggers who swear a lot if you want to understand the AI field. They're often the best source available, and that's not a joke, that's just factually true.
ANNA
Yeah, well, I would say that my best advice would probably be to look at a reputable journal within your field. I look at Nature's homepage every now and then to see what they've written recently about generative AI. Which is interesting, not because they're experts, but it's interesting to see what concerns are hot right now. You know, what people are talking about right now in the field.
LOVISA
Yeah, I agree.
ALVIN
Okay, so then I guess we're done.
LOVISA
Yeah, do the outro, Alvin.
ALVIN
Yes, I'm outroing away. Thank you for listening to our FAQ on large language models. There will probably be more episodes about AI in the future, focusing on how the field of AI affects the library world. The podcast is made for the Karolinska Institutet Library and the responsible publisher is Miriam Nauri. I've been Alvin Gavel.
LOVISA
I'm Lovisa Liljegren.
ANNA
And I'm Anna Borgström.
ALVIN
You can find out more about us and the library at kib.ki.se. There you can also find other episodes of this podcast and information about upcoming episodes.
Pilot: We talk about systematic reviews with Wim
In the pilot episode, we meet Wim Grooten, associate professor, researcher and lecturer. We talk about systematic reviews, a form of literature review that aims to synthesise all available research or evidence in a defined area using a specific methodology.
- What is most difficult about writing a systematic review?
- Can AI be used in the process?
- Don't forget your PICO!
This episode is in Swedish.