Neural Yorker


Cartoons are a form of communication.

A very well-known format is that of The New Yorker cartoon. Similar cartoons have long served in newspapers around the world as the relaxing part of a page. They catch your attention by acting as a form of decompression, either through a pun or through their contrast with the heavy, complex topics the page covers. They also convey a view of the world, either through the aesthetic compression of a troll or by condensing the absurdity of the world into a single sentence.

We made a large collection of such cartoons and built a dataset of texts paired with images. It is a very interesting dataset that can serve as a basis for research on multimodal learning, using transfer learning to achieve and investigate higher levels of understanding.

Using a language model and a GAN trained on this data, we created a pipeline in which a cartoon is generated every day and posted to our Twitter account.
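The pipeline described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names, the stub caption list, and the placeholder image generation are all assumptions standing in for a trained language model, a trained GAN, and a Twitter posting step.

```python
import random

def generate_caption(seed: int) -> str:
    """Stand-in for a language model sampling a one-line cartoon caption.
    A real pipeline would sample from a model fine-tuned on the dataset."""
    captions = [  # placeholder captions, not from the actual dataset
        "I told you the cloud was just someone else's computer.",
        "It's not procrastination, it's asynchronous living.",
    ]
    return random.Random(seed).choice(captions)

def generate_image(caption: str) -> bytes:
    """Stand-in for a GAN generator conditioned on the caption.
    A real implementation would run the generator network here."""
    return f"<image for: {caption}>".encode("utf-8")

def daily_cartoon(seed: int) -> tuple:
    """One daily run: caption first, then a matching image.
    A real pipeline would post the (caption, image) pair to Twitter."""
    caption = generate_caption(seed)
    image = generate_image(caption)
    return caption, image
```

The key design point the sketch preserves is the ordering: text is generated first, and the image model is conditioned on that text, mirroring how a cartoonist pairs a drawing with a caption.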

This is a task well suited to a language model: at their best, and contrary to popular opinion, language models are elegant and intelligent jugglers of words, much like newspaper cartoonists. Some people treat language models as a path to AGI and believe that feeding a model enough data, while stacking ever more transformer layers, will turn it into one. I agree that consciousness emerges as a property of increasing linguistic complexity. But increases in complexity are not merely increases in the number of operations; they also involve language's capacity for adaptive, flexible, and multimodal compression. This is why the human brain does not have one trillion neurons dedicated to language. I do believe, though, that the distance to AGI is a technical one, with all the depth that term can carry. Such scaling approaches are also an expression of the concentration of capital, which reduces technological innovation to a matter of resources and leads to a de-democratization of science.

Additionally, this works as a challenge for many generative image architectures, which so far succeed only on images built around the duality of a single centered object and its background, rather than on scenes composed of several arranged objects.

This is a work in progress; if you would like to support it, follow us on Twitter: https://twitter.com/NeuralYorker. This work is part of the efforts of Applied Memetic, funded by Ilan Manouach.
