I agree with the point you're making here, but it’s also funny that the descript...

mattmanser · on April 11, 2023

Can anyone estimate the chance that example tests with these questions were in its training set?

And it's just regurgitating the answers someone else wrote?

I imagine the chance is very high, given how much uni lecturers recycle exam questions.

When I was at uni you could just get the last 5 years' worth of questions from the library for almost any subject and guess what the questions were probably going to be. Often they just changed a few numbers.

Teaching undergrads is like a sausage factory. The actual intellectual value for undergrads is in the seminars, the practical value in the labs. The rest is showing you can regurgitate what you've been told.

Which ChatGPT excels at.

skepticATX · on April 11, 2023

I don’t know anything about quantum computing. I successfully answered a few true/false questions by pasting them into Google.

The exact questions aren’t in the top results, but the answers are.

M4v3R · on April 11, 2023

From the article:

> To the best of my knowledge—and I double-checked—this exam has never before been posted on the public Internet, and could not have appeared in GPT-4’s training data.

sudosysgen · on April 12, 2023

The exam, no, but most of the questions most certainly are. I know this because I've done extremely similar problems for homework and checked my answers online.

sudosysgen · on April 11, 2023

Yes, the vast majority of these questions are standard known problems it definitely already saw with a slightly different formulation.

moonchrome · on April 11, 2023

You can try phrasing the question in a way it wouldn't normally be phrased but that would still demonstrate understanding of the concept.

I remember Yann LeCun gave an interview and he came up with some random question like "If I'm holding a piece of paper with both of my hands above the desk and I release one, what would happen?". His point was that since the LLM doesn't have a world model, it wouldn't be able to answer these trivial intuitive questions unless it saw something similar in the training set. And then the interviewer tried it and it failed. That was 3.5. I've tried many variations of that class of problem with 4 and it seems to generalize basic physics concepts quite well. So maybe 4 learned basic physics? Why couldn't it learn QM theory as well?

jltsiren · on April 11, 2023

For a college graduate, that is the starting point. Test results are supposed to signal that the person can learn new things. While a fresh graduate needs a lot of supervision, they should quickly become more capable and productive.

For a language model, test results are the end. They are supposed to measure what the model is capable of. If you need better performance, you must train a better model.

ChatGTP · on April 11, 2023

It's the college graduates who aren’t the way you describe, the ones who show initiative and responsibility in their work, who are the best hires. So not much changes.