Humans vs LLMs in Idea Generation: What the Experts Say

Explore how AI and humans compare in idea generation, with insights from a recent expert study on LLM research ideation.

Greystack Technologies
6 min read · Oct 2, 2024

Idea generation is a core function of artificial intelligence, particularly in large language models (LLMs), which are increasingly adopted in enterprise settings.

As AI continues to evolve, businesses are leveraging these models to automate tasks, improve decision-making, and assist in generating ideas.

However, a critical question remains: Can LLMs truly match or surpass human creativity when it comes to idea generation?

A recent study, Can LLMs Generate Novel Research Ideas?, conducted by Chenglei Si et al., addresses this question head-on.

They organized a large-scale experiment with 100+ researchers to compare an AI ideation agent’s novel research ideas against those of human experts.

The findings reveal important insights into the potential, limitations, and future of AI-driven idea generation in research.

Let’s explore!

LLM and Idea Generation

The advent of LLMs has revolutionized how humans interact with machines.

Trained on vast amounts of text data, models can generate coherent, contextually appropriate responses across a wide range of topics.

Idea generation, however, represents a higher level of cognitive engagement, where creativity, novelty, and feasibility must converge.

While LLMs excel at generating content, evaluating whether they can produce truly novel, expert-level ideas is more complex than expected.

The Study Setup


The study set out to answer whether LLMs can generate innovative research ideas. It focused on the domain of natural language processing (NLP).

To guarantee fairness, the authors designed the experiment so that both human-generated and AI-generated ideas were blind-reviewed by highly qualified researchers.

Each participant was asked to generate ideas on specific NLP topics, and the AI model was given the same prompts.

Human experts then evaluated the novelty and feasibility of both sets of ideas, without knowing their origin.
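To make the blind-review idea concrete, here is a minimal Python sketch of how ideas might be anonymized before review and un-blinded afterwards. It is an illustration only, not the study's actual pipeline: the `Idea` class, the `blind` and `mean_score_by_source` helpers, and the ID format are all hypothetical names chosen for this example.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Idea:
    text: str
    source: str  # "human" or "ai"; hidden from reviewers


def blind(ideas, seed=0):
    """Shuffle ideas and return (reviewer_view, answer_key).

    reviewer_view maps an anonymous ID to the idea text only.
    answer_key maps the same ID to the hidden source label.
    """
    rng = random.Random(seed)
    shuffled = list(ideas)
    rng.shuffle(shuffled)
    reviewer_view = {}
    answer_key = {}
    for i, idea in enumerate(shuffled, start=1):
        anon_id = f"idea-{i:03d}"  # hypothetical ID scheme
        reviewer_view[anon_id] = idea.text
        answer_key[anon_id] = idea.source
    return reviewer_view, answer_key


def mean_score_by_source(scores, answer_key):
    """Un-blind after review: average reviewer scores per source."""
    grouped = {}
    for anon_id, score in scores.items():
        grouped.setdefault(answer_key[anon_id], []).append(score)
    return {source: sum(vals) / len(vals) for source, vals in grouped.items()}
```

Reviewers would only ever see `reviewer_view`; the `answer_key` stays with the organizers until all scores are in.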

The goal was not only to compare creativity but also to assess the practical applicability of these ideas.

They aimed to see whether LLM-generated ideas could spark innovation or at least complement human ideation processes.

Let’s see what they found:

Key Findings


LLM Ideas Were More Novel: Interestingly enough, LLM-generated ideas were judged to be more novel than those generated by human experts.

The statistical analysis revealed a significant difference (the paper reports p < 0.05), which suggests that AI models may bring a fresh perspective that challenges conventional thinking.

Feasibility Lagged Behind: Despite the novelty of the LLM-generated ideas, their feasibility was judged slightly weaker than that of human-generated ideas.
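For readers curious what "a significant difference" between two sets of review scores can look like in practice, here is a small Python sketch using a permutation test on the difference of means. This is a stand-in chosen because it needs only the standard library; it is not claimed to be the test used in the paper, and the scores below are made up for illustration.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test on the difference of means.

    Returns (observed_difference, p_value), where the difference is
    mean(group_a) - mean(group_b).
    """
    rng = random.Random(seed)
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        # Randomly reassign scores to the two groups.
        rng.shuffle(pooled)
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if abs(diff) >= abs(observed):
            at_least_as_extreme += 1
    # Add-one smoothing keeps the estimate away from an impossible p = 0.
    p_value = (at_least_as_extreme + 1) / (n_permutations + 1)
    return observed, p_value


# Made-up 1-10 novelty scores, for illustration only (not data from the study).
ai_novelty = [6, 7, 5, 6, 8, 6, 7, 5, 6, 7]
human_novelty = [5, 4, 6, 5, 4, 5, 6, 4, 5, 5]

diff, p = permutation_test(ai_novelty, human_novelty)
print(f"mean difference = {diff:.2f}, p = {p:.4f}")
```

A small p-value here means that a gap this large rarely appears when the "AI" and "human" labels are shuffled at random, which is the intuition behind calling a difference statistically significant.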

While AI was able to introduce new concepts, it sometimes struggled to align those ideas with practical, real-world applications.

Bias in Idea Generation: Another key finding was that the LLM’s self-evaluation was less reliable, often overestimating the feasibility of its ideas.

This points to an important limitation in using AI for ideation: the lack of critical self-assessment capabilities.

Diversity in Idea Output: Lastly, the study also highlighted that the diversity of ideas generated by LLMs was relatively limited.

LLMs tend to follow patterns based on their training data, which can limit the scope of creativity in certain domains.
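One rough way to put a number on limited diversity is to count how many generated ideas survive de-duplication. The Python sketch below does this with word-overlap (Jaccard) similarity, which is a simple stand-in and not the similarity measure from the study; the function names and the 0.8 threshold are assumptions made for this example.

```python
def tokens(text):
    """Lowercase a string and split it into a set of word tokens."""
    return set(text.lower().split())


def jaccard(a, b):
    """Jaccard similarity between two token sets (0.0 to 1.0)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def unique_ideas(ideas, threshold=0.8):
    """Keep an idea only if it is not too similar to any idea already kept."""
    kept = []
    kept_tokens = []
    for idea in ideas:
        tok = tokens(idea)
        if all(jaccard(tok, seen) < threshold for seen in kept_tokens):
            kept.append(idea)
            kept_tokens.append(tok)
    return kept


def diversity_ratio(ideas, threshold=0.8):
    """Share of generated ideas that survive de-duplication."""
    if not ideas:
        return 0.0
    return len(unique_ideas(ideas, threshold)) / len(ideas)
```

A ratio close to 1.0 would mean most generated ideas are distinct, while a low ratio would indicate the model keeps returning near-duplicates as more ideas are requested.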

Challenges

While the results show promise, several challenges stand in the way of LLMs becoming reliable research ideation tools:


Evaluation of Novelty and Feasibility: The difficulty of assessing novelty is evident even among human experts.

It was found that human judgments of novelty were subjective and inconsistent, making it harder to evaluate LLM ideas definitively.

More objective measures of creativity will be required before LLMs can be integrated into ideation processes.

Lack of Self-Evaluation in LLM: Another significant challenge is the failure of LLMs to self-evaluate their output accurately.

Unlike human researchers, AI tends to overlook practical constraints, which points to the need to improve the model’s self-assessment capabilities.

Lack of Diversity: The range of generated ideas is notably limited. Without the ability to draw from a more diverse set of concepts, LLMs risk reinforcing existing biases.

What does this mean for you?

Implications for Businesses

For businesses, especially those focused on research and development or innovation, the implications of this study are significant:

  1. Fostering Innovation — By integrating LLMs into your research workflows, your business can foster a more innovative culture.
  2. Optimizing Research Processes — LLMs offer the potential to streamline your ideation processes and generate a wide range of possibilities in less time.
    For your business, this can translate to more efficient R&D cycles, allowing you to explore more ideas without being limited by human labor.
  3. AI as a Research Assistant — You wouldn’t need to replace humans. AI can serve as a complementary tool that assists in exploring novel ideas.
    Automating parts of the ideation process can allow human experts to focus on evaluating and refining those ideas.

For Professionals

The implications of this study extend beyond academia.

LLMs may help experts such as research scientists, marketing strategists, and content creators with their ideation processes.

Here are a few things to take note of:

  1. Augmenting Human Creativity — LLMs can act as creative partners, offering fresh perspectives that you might not have considered.
  2. Combining AI and Human Expertise — Although LLMs can generate novel ideas, these often need human refinement to guarantee feasibility.
    Professionals such as scientists and engineers can take AI-generated concepts and apply their expertise to adapt them to real-world constraints.
  3. Critical Review of AI Output — Given the limitations in the AI’s feasibility assessment, professionals in creative fields must critically evaluate the output.

Considerations

When considering the adoption of LLMs for ideation, you must be aware of these factors:

  • Human Oversight is Necessary — Since LLMs struggle with feasibility evaluation, human experts remain critical in assessing AI-generated ideas.
  • LLMs Should Be Used as a Complementary Tool — LLMs lack diversity in their outputs. It’s recommended to use them in tandem with human creativity. They should serve as a tool to enhance, rather than replace, human ideation.
  • Training and Bias — The output of LLMs is heavily influenced by their training data. Therefore, businesses must ensure models are trained on diverse, high-quality datasets to prevent the generation of biased or repetitive ideas.
  • Ethical and Practical Concerns — As with any AI system, ethical considerations must be taken into account, particularly regarding the use of AI in sensitive or high-stakes domains such as scientific research.

LLMs are proving to be valuable tools in the world of idea generation. However, their limitations in terms of feasibility and self-evaluation highlight the importance of human involvement.

The study served as a critical step in helping us understand how LLMs can augment, rather than replace, human researchers.

As AI continues to evolve, the collaboration between human ingenuity and machine intelligence remains central to advancing scientific discovery and technological innovation across sectors.

