It is assumed that you already have training and test data. The data consists of many examples (I'm using 684K examples). Each example consists of the text from the start of the article, which I call the description (or desc), and the text of the original headline (or head). The texts should already be tokenized, with the tokens separated by spaces. Once you have the data ready, save it in a Python pickle file as a tuple: (heads, descs, keywords), where heads is a list of all the head strings and descs is a list of all the article strings, in the same order and of the same length as heads. I ignore the keywords information, so you can place None there.
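
As a rough illustration of the expected layout, here is a minimal sketch that writes and reads back such a pickle file. The two examples are made up, and the file name `example_tokens.pkl` and its location in the temp directory are hypothetical choices for this sketch only; use whatever name and path your own training code expects.

```python
import os
import pickle
import tempfile

# Toy, made-up examples. Tokens are already separated by single spaces.
heads = [
    "stocks rise as markets rally",
    "local team wins championship",
]
descs = [
    "stocks rose sharply on monday as investors returned to the market .",
    "the local team won the championship on sunday after a close final game .",
]

# heads and descs must be aligned: same length, same order.
assert len(heads) == len(descs)

# The keywords slot is ignored, so None is enough.
keywords = None

# Hypothetical output path, chosen only for this sketch.
path = os.path.join(tempfile.gettempdir(), "example_tokens.pkl")

# Save the data as a single (heads, descs, keywords) tuple.
with open(path, "wb") as fp:
    pickle.dump((heads, descs, keywords), fp)

# Read it back to check the layout.
with open(path, "rb") as fp:
    loaded_heads, loaded_descs, loaded_keywords = pickle.load(fp)
```

Splitting any string on spaces (for example `loaded_heads[0].split()`) should give back the token list for that example.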