Blender, Facebook State-of-the-Art Human-Like Chatbot, Now Open Source

erek

This is actually really cool. I stood up a chat instance myself to play with it. Here are some results:

1588559674507.png


"According to Facebook, Blender's edge over Meena can be explained based on Blender's skill blending and generation strategies. Strikingly, human evaluators preferred a conversation with Blender over a conversation with humans 49% of the time, while this figure decreases to 36% when using models unable to blending skills.

The evolution of human-like chatbots does not end with Blender, which still displays a number of shortcomings, like contradicting or repeating itself, or "hallucinating" knowledge, i.e. making up facts.

We’re currently exploring ways to further improve the conversational quality of our models in longer conversations with new architectures and different loss functions. We’re also focused on building stronger classifiers to filter out harmful language in dialogues. And we’ve seen preliminary success in studies to help mitigate gender bias in chatbots.
Major areas of research for future development include mitigating gender bias and filtering out harmful language. Facebook hopes that Blender can help the AI research community further advance the state of the art in conversational chatbots."

https://www.infoq.com/news/2020/04/facebook-blender-chatbot/
 
aibot
i have heard of star citizen , but never played it . i ' ll have to check it out .

System 10:35 PM
@abcd joined the team.

aibot
that is good to hear . what else do you do in your free time besides collect video cards ?
 
What's the point of an AI chatbot? For tech support I assume?
I suspect many companies have been using them for answering emails and tech support requests for a while now, without disclosing it to the customer. I mean, judging by some of the things I've had to deal with through tech support, a human can't possibly be that stupid.
 
I suspect many companies have been using them for answering emails and tech support requests for a while now, without disclosing it to the customer. I mean, judging by some of the things I've had to deal with through tech support, a human can't possibly be that stupid.

A human hired by incompetent management surely can :eek:
 
Man, the 3B model requires some serious horsepower... even a 2080 Ti doesn't have enough VRAM...

RuntimeError: CUDA out of memory. Tried to allocate 26.00 MiB (GPU 0; 11.00 GiB total capacity; 8.74 GiB already allocated; 2.59 MiB free; 8.75 GiB reserved in total by PyTorch)
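
For anyone else trying this, a quick pre-flight check saves the crash. This is just a rough sketch with plain PyTorch; the ~15 GB threshold and the zoo path for the 3B model are my guesses based on the numbers above, not official requirements:

import torch

# Rough pre-flight check before picking a Blender model size.
def pick_model_file():
    if not torch.cuda.is_available():
        return "zoo:blender/blender_90M/model"  # no GPU: stick to the small model
    total_gb = torch.cuda.get_device_properties(0).total_memory / 1024**3
    # Assumption: the 2.7B model wants well over the 11 GB a 2080 Ti has.
    if total_gb >= 15:
        return "zoo:blender/blender_3B/model"
    return "zoo:blender/blender_90M/model"

print("Would load:", pick_model_file())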
 
Wow, it is a beefy program if it requires more than 12GB VRAM right out of the gate. :eek:
 
Wow, it is a beefy program if it requires more than 12GB VRAM right out of the gate. :eek:
I got the smaller 90M model running on my Discord right now. Wish I had enough horsepower to run the 2.7B model.

https://discord.gg/UE3rEs

1588920770486.png



[ Using CUDA ]
f:\parlai\parlai\utils\fp16.py:144: UserWarning: You set --fp16 true with --fp16-impl apex, but fp16 with apex is unavailable. To use apex fp16, please install APEX from https://github.com/NVIDIA/apex.
'You set --fp16 true with --fp16-impl apex, but fp16 '
Dictionary: loading dictionary from f:\parlai\data\models\blender/blender_90M/model.dict
[ num words = 54944 ]
[TransformerGenerator: full interactive mode on.]
Total parameters: 87,508,992 (87,508,992 trainable)
[ Loading existing model params from f:\parlai\data\models\blender/blender_90M/model ]
[creating task(s): blended_skill_talk]
[ loading personas.. ]

[NOTE: In the BST paper both partners have a persona.
You can choose to ignore yours, the model never sees it.
In the Blender paper, this was not used for humans.
You can also turn personas off with --include-personas False]

We have logged in as GFWAIBot#4291
 
What's the point of an AI chatbot? For tech support I assume?

Yep. Most of the websites that use them just feed them the data from their Q&A page, then try to match your question to one of the listed questions and feed you the answer.
More advanced ones add steps to that: basically, you asked question 1 and got a specific answer, so now you're on a different set of Q&As and it chooses from those.
Most companies' support people do the exact same thing. They're supposed to just follow a script, so it isn't hard to replace them with a chatbot.

A more advanced form of this is the Google Assistant AI that is supposed to set up appointments and order food for you.


There is a lot more an advanced chatbot could do. But most of today's chatbots don't really have memory and form each response with no external context; they just look at the last message.
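
A rough sketch of what that first kind of bot boils down to (the Q&A pairs are made up, and difflib is just standing in for whatever fuzzy matching a real product uses):

import difflib

# Toy FAQ "chatbot": match the user's message to the closest canned
# question and return its scripted answer. No memory, no context.
FAQ = {
    "how do i reset my password": "Click 'Forgot password' on the login page.",
    "what are your support hours": "Support is available 9am-5pm, Monday to Friday.",
    "how do i cancel my subscription": "Go to Account > Billing > Cancel.",
}

def reply(user_message: str) -> str:
    match = difflib.get_close_matches(user_message.lower(), list(FAQ), n=1, cutoff=0.3)
    if match:
        return FAQ[match[0]]
    return "Sorry, I didn't understand that. Let me connect you to an agent."

print(reply("How can I reset my password?"))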
 
Translation: censorship and discrimination as prescribed by our silicon valley overlords.
Yep, they no longer have to employ people who could eventually wise up and change their minds about what they're censoring. They created this AI bot to censor everything they disagree with, and they won't have to pay employees to do it.
 
Is there a compiled standalone version I could run on my own discord? I'd love to play with it as an end user but don't really want to go through the hoop jumping of installing Python and the various libraries.
 
Is there a compiled standalone version I could run on my own discord? I'd love to play with it as an end user but don't really want to go through the hoop jumping of installing Python and the various libraries.

I learned the hard way: on Windows at least, go with Anaconda as your Python provider, and then it's not bad at all:

Anaconda3-2020.02-Windows-x86_64
 
There's already an open source program named "Blender." I really wish companies would stop using names that are either already in use or just plain generic as hell. It feels like an attempt to redefine the word for trademark or other bullshit reasons, or just plain ignorance. "Mixer" is another example of this... wtf, there are already several everyday things called a mixer.
 
sharknice I've made some improvements by cleaning up the output:

1590880855339.png




if message.content.startswith(''):
    conversation = parlai_speak(world, message.content)
    for msg in conversation[-1]:
        print(f"{id}: {msg}")
        # Strip the "TransformerGenerator" speaker tag and undo the tokenizer's spaced-out punctuation.
        cleaned = (msg.replace("TransformerGenerator", "~")
                      .replace(" ' ", "'").replace(" ?", "?")
                      .replace(" .", ".").replace(" ,", ","))
        if cleaned != '~':
            await message.channel.send(cleaned)

            # Optional: read the reply aloud with Flowtron/WaveGlow text-to-speech.
            #os.system('python inference.py -c config.json -f models/flowtron_ljs.pt -w models/waveglow_256channels_v4_new.pt -t \"'+ msg +'\" -i 0')
            test(cleaned, args.id, args.n_frames, args.sigma, args.seed)
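
For anyone wanting to reproduce it, here's roughly how that handler hangs together as a complete discord.py script. parlai_speak and world are stand-ins for however you wire ParlAI in (the real hookup depends on your ParlAI version), so treat this as a sketch rather than a drop-in bot:

import os
import discord

# --- placeholders: swap these for your real ParlAI hookup ---
world = None  # whatever object your ParlAI setup uses

def parlai_speak(world, text):
    # Stand-in that just echoes; the real version feeds `text` to the
    # Blender agent and returns the conversation so far.
    return [[f"TransformerGenerator: you said {text}"]]
# -------------------------------------------------------------

client = discord.Client()  # newer discord.py releases also want intents=discord.Intents.default()

@client.event
async def on_ready():
    print(f"We have logged in as {client.user}")

@client.event
async def on_message(message):
    if message.author == client.user:
        return  # never reply to our own messages
    conversation = parlai_speak(world, message.content)
    for msg in conversation[-1]:
        cleaned = (msg.replace("TransformerGenerator", "~")
                      .replace(" ' ", "'").replace(" ?", "?")
                      .replace(" .", ".").replace(" ,", ","))
        if cleaned != "~":
            await message.channel.send(cleaned)

client.run(os.environ["DISCORD_TOKEN"])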
 
Look at all these people on my Discord still talking to the AI bot!!!!!


1591086284418.png
 
Am I alone in pretending to be an AI bot, just to impress people?
 
https://ai.facebook.com/blog/introd...ush-the-limits-of-natural-language-processing

Read the full paper: Adversarial NLI: A New Benchmark for Natural Language Understanding
Try the demo: Adversarial NLI




"What the research is:
Benchmarks play a crucial role in driving progress in AI research. They form a common goal for the research community and allow for direct comparisons between different architectures, ideas, and methods. But as research advances, static benchmarks have become limited and saturate quickly, particularly in the field of natural language processing (NLP). For instance, when the GLUE benchmark was introduced in early 2018, NLP researchers achieved human-level performance less than a year later. SuperGLUE added a new set of more difficult tasks, but it was also soon saturated as researchers built models that could achieve “superhuman” performance on the benchmark.

Such static benchmarks can result in building models that not only overfit on these benchmarks but also pick up on inadvertent biases that may exist, rather than truly understanding language. Famously, simply answering the number “2” in response to quantitative “How much?” questions in some QA data sets can yield unexpectedly high accuracy. While there has been rapid progress in NLP, AI systems are still far off from truly understanding natural language. This raises the question, are our benchmarks measuring the right thing? Can we make a benchmark more robust and last longer?

In order to provide a stronger NLP benchmark, we’re introducing a new large-scale data set called Adversarial Natural Language Inference (ANLI). NLI is a core task of NLP and a good proxy for judging how well AI systems understand language. The goal is to determine whether a statement can be inferred (positive entailment) from a given context. For example, the statement “Socrates is a man, and men are mortal” entails “Socrates is mortal,” while “Socrates is immortal” is a contradiction and “Socrates is a philosopher” is neutral.
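
If you want to try those NLI labels on the Socrates example yourself, an off-the-shelf MNLI model is enough to see entailment/neutral/contradiction in action. A sketch, assuming the Hugging Face transformers library and the publicly hosted roberta-large-mnli checkpoint (an ordinary MNLI model, not the ANLI-trained one):

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# roberta-large-mnli classifies a (premise, hypothesis) pair as
# CONTRADICTION / NEUTRAL / ENTAILMENT.
tokenizer = AutoTokenizer.from_pretrained("roberta-large-mnli")
model = AutoModelForSequenceClassification.from_pretrained("roberta-large-mnli")

premise = "Socrates is a man, and men are mortal."
for hypothesis in ["Socrates is mortal.", "Socrates is immortal.", "Socrates is a philosopher."]:
    inputs = tokenizer(premise, hypothesis, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    label = model.config.id2label[int(logits.argmax())]
    print(hypothesis, "->", label)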

We took a novel, dynamic approach to building the ANLI data set, in which human annotators purposely fool state-of-the-art models on such NLI tasks, creating valuable, new examples for training stronger models. By repeating this over several rounds, we iteratively push state-of-the-art models to improve on their weaknesses and create increasingly harder test sets. If a model overfits or learns a bias, we can add a new round of examples to challenge the model. As a result, this dynamic, iterative approach makes this task impossible to saturate and represents a new, robust challenge for the NLP community.

How it works:
Our novel approach to data collection is called HAMLET (Human-and-Model-in-the-Loop-Entailment Training). We employed human annotators to write statements that purposely try to make state-of-the-art models predict the wrong label for a given context (or premise). We randomly sampled the contexts from publicly available third-party data sets. If they succeeded in fooling the model, we gave them a bigger reward, incentivizing annotators to come up with hard examples that are valuable for training more robust models. For each human-generated example that is misclassified, we also asked the annotator to provide a reason that they believe the model failed, which is then verified by another person.


[Figure] The four steps make up one round of data collection. In step 3, model-correct examples are included in the training set. Development and test sets are constructed solely from model-wrong, verified-correct examples.

We repeated the procedure over three rounds, collecting examples against different models that become increasingly stronger as they’re trained on the newly collected data. We show that this process results in annotators creating more difficult examples, which are consequently more valuable for training. The collected examples pose a dynamic challenge for current state-of-the-art systems, which perform poorly on the new data set.
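
To make the round structure concrete, here's a toy simulation of one HAMLET-style round. Every function here is a stand-in (a real round uses human annotators and a trained NLI model, not random stubs), so read it as an illustration of the protocol rather than an implementation:

import random

LABELS = ["entailment", "neutral", "contradiction"]

def model_predict(context, hypothesis):
    return random.choice(LABELS)  # stand-in for the current state-of-the-art model

def annotator_writes(context, target):
    return f"hypothesis about '{context}' aiming for {target}"  # stand-in for a human annotator

def verifier_confirms(context, hypothesis, target):
    return True  # stand-in for the second human's check of the label

def collect_round(contexts):
    train_set, dev_test_pool = [], []
    for context in contexts:
        target = random.choice(LABELS)                 # label the annotator tries to elicit
        hypothesis = annotator_writes(context, target)
        prediction = model_predict(context, hypothesis)
        example = (context, hypothesis, target)
        if prediction == target:
            train_set.append(example)                  # model-correct: training data only
        elif verifier_confirms(context, hypothesis, target):
            dev_test_pool.append(example)              # model-wrong and verified: dev/test material
    return train_set, dev_test_pool

train, hard = collect_round(["Socrates is a man, and men are mortal."])
print(len(train), "easy,", len(hard), "hard")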

Why it matters:
Current static benchmarks are struggling to keep up with progress in NLP. With our new HAMLET approach and ANLI data set, where both models and humans are in the loop interactively, we can push state-of-the-art models toward meaningful improvements in language understanding.

Dynamic adversarial data collection helps us better measure the strength of our models. The harder it is to fool an NLU system, the stronger its ability to truly understand human-level language. Looking forward, we believe that benchmarks should not be static targets. Instead, the research community should move toward a dynamic approach of benchmarking, with current state-of-the-art models in the loop."
 