@cjd @Moon
Obvious jokes aside, the problem is that you cannot have humans create the dataset, since humans are incapable of making this distinction themselves.
The entire concept of schizophrenia and intelligence being two sides of the same coin applies here tenfold. Brilliant people see patterns that you cannot visualize, which means you cannot know whether they are actually smart or just bullshitters.
This is why most attempts at doing this end up just measuring how many niche words you use, since niche words are needed to write a scientific article. But then you immediately end up with social science loons, who cannot form a single sentence without going full systemic prejudice against marginalized metaphors for cheese.
@cjd @Moon
If you want an AI to learn what interests YOU, then even the dumbest "find words I like" system will do the job. As long as the text contains "linux, boot, freeware, software, hardware", it's great.
If it contains "republican, democrat, trump, fuck, n*****", it's BAD...
But you have changed the goal entirely now. Your original post was about finding intelligence.
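For what it's worth, the "find words I like" system described in this post could look roughly like the sketch below. The word lists are hypothetical, loosely lifted from the examples in the post, and the function names and threshold are made up for illustration. It only counts keyword hits, so it says nothing about intelligence.

```python
import re

# Hypothetical word lists, loosely based on the post; adjust to taste.
LIKED = {"linux", "boot", "freeware", "software", "hardware"}
DISLIKED = {"republican", "democrat", "trump"}

def keyword_score(text, liked=LIKED, disliked=DISLIKED):
    """Return (number of liked-word hits) minus (number of disliked-word hits)."""
    # Lowercase and split into plain alphabetic tokens.
    words = re.findall(r"[a-z']+", text.lower())
    good = sum(1 for w in words if w in liked)
    bad = sum(1 for w in words if w in disliked)
    return good - bad

def is_interesting(text, threshold=1):
    """Assumed rule: a post passes if its score reaches the threshold."""
    return keyword_score(text) >= threshold
```

Calling `is_interesting(post_text)` on each post would keep the ones dominated by liked words. As the thread points out, this captures topic preference, not quality.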
@cjd @pepsi_man @Moon
I am sorry, but I thought you were very well educated when it comes to IT.
So why would you believe that less information is good for learning?
But the thing is, I can't think of any particular set of words that communicate what I would find interesting - it's like trying to word-filter for what you'll find funny. What do you do? Filter for "knock knock"? Maybe if you're 5.
Most political takes are boring and repetitive. Most science is horrifying midwittery. Most blockchain takes are spam and get-rich-quick. Most conspiracy takes are aliens and flat earth bullshit. BUT, there's 1% in each category which is a flash of brilliance (IMO) and I'd really like to try to filter for it...