From 49e343d82d59c70d272c8fb98cfcdd461dda0bee Mon Sep 17 00:00:00 2001
From: Christian Schabesberger <chris.schabesberger@mailbox.org>
Date: Mon, 26 Mar 2018 08:47:15 +0200
Subject: [PATCH] enhance list extractor description

---
 assets/InfoItemsCollector_objectdiagram.dia   | Bin 1263 -> 1262 bytes
 docs/01_Concept_of_the_extractor.md           |  69 ++++++++++++++----
 docs/img/InfoItemsCollector_objectdiagram.svg |  40 +++++-----
 3 files changed, 76 insertions(+), 33 deletions(-)
diff --git a/assets/InfoItemsCollector_objectdiagram.dia b/assets/InfoItemsCollector_objectdiagram.dia
index f99ccafc4a963b5cd2c847d2a32532b818a84d84..5b0c0a4ef25e5932ac092e879ab2b8337bc9bbd9 100644
GIT binary patch
delta 1150
zcmV-^1cCeS3GNAhABzY8000000t4-vNspsA6oBvbD-d<9CM<zKRVS^{XhzdgkNPmv
zm%c<!aEQANj*Q82ncqHUNhNGafRL)Rj8u_u9`D%?zvWZPA3tq<EUh685ede!rl_(6
zgE?_g;Em;P|Ni;GmOo7%KDr1z^FNOQU*b;$VSG21myE4{o(F@?W~1O;2pFLXMv)T2
z!G8c_FyKN1dGa7h#|m7)Ks=U>0>&tsMGQ&-@ZngVf%#95k|=QHR4S936HKVI26!w#
zE|N<gWNHREJtuUpfCpz3f}i!ORem{68`UdFi)#H92@w}zyH!b~Mo;{CB$G*nTqy7+
zkG~Bc(<x<t0|#AI6RkDEd_X-El<jce;M5ViV(Yrjonbn*qmG6nZSRxj=4Q7whi_|+
z-qsunrz=7k1&EbR&IrK}1gTX<BiP<fI0u+-u~W<Hu!;p@jGUSO0^snH1*F5T(bhC_
zJ&N4Zft!=4(Xv4<TTZt<xhvJa>q+|>g=mIh)$CD!z^<l#cOCV`x07V=!{Z%yL#nN|
zJh(PAq1n7VcoA}8cxIdBWok4nGqJ%ruNUPmbxb-`DAtJy40e!a>i^=e4pNWCby~{a
zaF%i;FlT7JzdLdYnf-usmaX>N-(RE;(${22KC~db0YUCq{;KAERa_lc#nWKPSu1kb
zQR<U_W)xRn&K-5iXS1Si$|ni}55rWyqd4Z1rt{AiZKP-#zZojF+0LR58IaHHa~E+(
zwbBkT^rv$|gL3<E9fu_bk{nfb+CD#x6xA9!N0-^b))FFb$?A=1imm3ZQ9EP20>X0(
zQxz?5OV!#LV{pr+ak7bMc`xKUR8{S;o_<YdED**Qn8>&A%^h_vS_(3CO-|BA=i~|<
z_w(mqK|V9+hrbhyc?ux3rUgGyx9VI)0#;=^9+wutM|j7l2!c?SLbi+ZE6<Ku`h|E6
z8JdIRI?#Lur8uhG--TFGhuw_k^LAw_G8EI3;R1qx3{A1@xFl+K4H+Fgb_rx~?^wKX
z5M&|9;@y)4*XaMTWfa6rp}lg(>yT(h`ff&aQ?WJ6h;ypI41pN}GX!P`%n+DyZOqWG
zff+{cn34XL@yZ#)yXOo;aE9Ov!5KNu&;@6_A!mr6GX!S{&Jdg-I74v8^*O`124`6P
zbB39J<BZYWbA}~2LvV)Rj2veef-~NbGe&|l1ZN1&5S$@6LvTiO&iIEn?Jl4$hmG&Z
z`wqM>ycs$I4lM(YVdpa%$-#-?lV&JRpWG3jJ&4Wl$TeSL3UM4OG1{`TCX_9w$vT>^
zu0qbv{@eLdIrWSjU*IrapxWB2@%jiw`7l0z6~d>gb^UXVN?OEwND{GpEjoAV@sx!*
z2?D-R_?8Y!g>wImwB(_X6&gm8b4k+S+p%ny7FzMFmOZQ0KD$qIg$Z3VSj~G2R^xpJ
zE0?n_k#iJ#Kk<u#)eyn@u7Xv6%V4$cJ6NrIt*q9C!D@+MeOJM1bPd+@J~`k#dB~j&
QfSx@356H6=fF(r$0G&r56aWAK

delta 1176
zcmV;J1ZVs13GWGiABzY8000000t4-vTW{hx6oB9RE28wZ6LN*59T=@vyV{XvH4nS<
zGB3yw6FlR@$Z@#5>~CM^LLoOG387`=NJWi%a*loCbNNL5@zd7F(i+kbkzg!qiYiMm
zm=hNT-dO(j@1GxR`P1a#ql>^Z|M4jBCH_PZ#&=_R$=K?Dc`(>)HVWQ_fDx)-6e%Gb
z{0A@w11>a>Cl8WztiS~f#ADegV2q+!#Gn)aACBc2nE&)Bi2_$nr82oW!Gua{fXDLV
zBDv&2re=`Sb3*qDcyLA`_*t)7<(K2MQN4n+sMcSR5OEQ<Ta`p=^u(V>GMQA!g#vH#
z_?!8dPAMCIIOwXHXsr?E1L~omY=`>>r;gASTi12&49l?{b!3jTy-%8(o88tNzO6ZW
zTXQI!t_WomAXYXxBLqVbq*fV?V0$~^9ALi1PA#j$Di(+_a%TPufWu1`kPg2_Thqw(
zC~{8+Zcd^`%Lch@Io<Z;u2lQ3C+%w#q8Wx&vqu4cyPEplb<`K%PLjP3k9XV+skYkk
z;M&lFR`c@UMaYHWnQfMrsnN8|#0KZQUX;7kG3ijDSSKbh*g=}9|BJsmNIe?YX(@Zd
zS;~>XoT2sp?#L-*_5;pYw%Tuhe~~^&Uy~vE(1P#=1i54RtD5&!adli3PlF|At;k_V
zsZW}JQCxjFcho7L&5F7ypC|}C3{(A%;#g0b&OcwYk)mn*##C&pokbrqAfMUiF5-@A
zr5$4EPv?XN<@Vz`4oeIqIjZcmeSR7#>d<tKE~|sBB}Cqm)f>|kTg_dgcE)%Ggy$Bf
zDq7x_s<ku5;Fe9}WE0WyUdVT-s@h>a{hC;RAdD|Ck#FIfJL+7t6lCg}oTQD;$rU>8
z=g+}{d}h!Oe<v986hLTA3x1+*)wzlUtjcygE-iqM@QzOr1feX2Y!~NOo*lFF3-KB<
zGzZ6Zp!p0+aa6g#3$dh{-Hhh*c4aCu6wA<)jH({zi8j{q1^vPe-^LJcm%YG;)vpwj
z`vPfybVpIOxFl+K4H+Fgb_rzIiZj%$>VzgBLqNtmg$%CI|6|K2i19*ue*IPSBYpQH
zx~Z79rpIYjkcJ=)K^lTI1ZfD;xHf6%*B}j}chX3|&Ui%)^X^f@5U3$gL!d?-YUlzr
z-T*bkml^^!1ZoJ>5U3$g<NBy!Tmv<R{iB9|l|zlu-J`}(poTyVff{wFVF=WC1JoD^
z)DWm4P(z@GKn;N!%~9hY-nYAex|}z@Bkw!#zVIgKh&i;3In2(7HIfq)=96Y9PM_!z
zA3})D@W{1ZVi0khQDU?ud=Sc((_|gZS63lt$N%;`yj)H_BgYqL#tT$ido^Akp(r1J
z#-~E~RJE>uu2D&gcn?V;maj$Ujz6BVFegF4Hwxd<VX08={~|4UC}f3(k>r4qboeW0
zw9-N=p4GBvwc2O*X|6D#YX+-zZ^3H3&tT<p!%O5G#okZ+qF^;du)eEc)!#B$hxZ+<
q!+Wi)!wZ9TD1!A}1*_3DSkwFDr1RttA$Jsd^6)>#i8z`rMF0TEzd{-S

diff --git a/docs/01_Concept_of_the_extractor.md b/docs/01_Concept_of_the_extractor.md
index eb3b7b1..4862404 100644
--- a/docs/01_Concept_of_the_extractor.md
+++ b/docs/01_Concept_of_the_extractor.md
@@ -55,25 +55,68 @@ class MyExtractor extends FutureExtractor {
 
 ## Collector/Extractor pattern for lists
 
-Sometimes information can not be represented as a structure, but as a list. In NewPipe an item of a list is called
-[InfoItem](https://teamnewpipe.github.io/NewPipeExtractor/javadoc/org/schabi/newpipe/extractor/InfoItem.html). In order
-to get such items a [InfoItemsCollector](https://teamnewpipe.github.io/NewPipeExtractor/javadoc/org/schabi/newpipe/extractor/InfoItemsCollector.html)
-is used. For each item that should be extracted a new Extractor will be given to the InfoItemCollector via [commit()](https://teamnewpipe.github.io/NewPipeExtractor/javadoc/org/schabi/newpipe/extractor/InfoItemsCollector.html#commit-E-).
+Sometimes information can be represented as a list. In NewPipe, a list is represented by an
+[InfoItemsCollector](https://teamnewpipe.github.io/NewPipeExtractor/javadoc/org/schabi/newpipe/extractor/InfoItemsCollector.html).
+An InfoItemsCollector collects and assembles a list of [InfoItem](https://teamnewpipe.github.io/NewPipeExtractor/javadoc/org/schabi/newpipe/extractor/InfoItem.html)s.
+For each item that should be extracted, a new Extractor must be created and passed to the InfoItemsCollector via [commit()](https://teamnewpipe.github.io/NewPipeExtractor/javadoc/org/schabi/newpipe/extractor/InfoItemsCollector.html#commit-E-).
 
 ![InfoItemsCollector_objectdiagram.svg](img/InfoItemsCollector_objectdiagram.svg)
 
-When a streaming site shows a list it usually offers some additional information about that list, like it's title, a thumbnail
+If you are implementing a list for your service, you need to extend InfoItem to hold the extracted information
+and implement an [InfoItemExtractor](https://teamnewpipe.github.io/NewPipeExtractor/javadoc/org/schabi/newpipe/extractor/Extractor.html)
+that returns the data of one InfoItem.
+
+A common implementation would look like this:
+```
+private MyInfoItemCollector collectInfoItemsFromElement(Element element) {
+    MyInfoItemCollector collector = new MyInfoItemCollector(getServiceId());
+
+    for (final Element li : element.children()) { // one extractor per list entry
+        collector.commit(new InfoItemExtractor() {
+            @Override
+            public String getName() throws ParsingException {
+                ...
+            }
+
+            @Override
+            public String getUrl() throws ParsingException {
+                ...
+            }
+
+            ...
+        });
+    }
+    return collector;
+}
+```
+
+## InfoItems encapsulated in pages
+
+When a streaming site shows a list of items, it usually offers some additional information about that list, like its title, a thumbnail,
 or its creator. Such info can be called __list header__.
 
-Also if you open a list in a web browser the website usually does not load the whole list, but only a part
-of it. In order to get more you may have to click on a next page button, or scroll down. This is why a list in
-NewPipe is coped down into [InfoItemPage](https://teamnewpipe.github.io/NewPipeExtractor/javadoc/org/schabi/newpipe/extractor/ListExtractor.InfoItemPage.html)s. Each Page has its own URL, and needs to be extracted separately.
-
-List header information and extracting multiple pages of an InfoItem list can be handled by a
-[ListExtractor](https://teamnewpipe.github.io/NewPipeExtractor/javadoc/org/schabi/newpipe/extractor/ListExtractor.html)
-
-
+When a website shows a long list of items, it usually does not load the whole list, but only a part of it. To get more items, you may have to click a next-page button or scroll down.
 
+This is why lists in NewPipe are split into smaller lists called [InfoItemsPage](https://teamnewpipe.github.io/NewPipeExtractor/javadoc/org/schabi/newpipe/extractor/ListExtractor.InfoItemsPage.html)s. Each page has its own URL and needs to be extracted separately.
+
+Additional meta information about the list, such as its title, a thumbnail,
+or its creator, as well as the extraction of multiple pages, can be handled by a
+[ListExtractor](https://teamnewpipe.github.io/NewPipeExtractor/javadoc/org/schabi/newpipe/extractor/ListExtractor.html)
+and its [ListExtractor.InfoItemsPage](https://teamnewpipe.github.io/NewPipeExtractor/javadoc/org/schabi/newpipe/extractor/ListExtractor.InfoItemsPage.html).
+
+For extracting list header information, a ListExtractor behaves like a regular extractor. For handling `InfoItemsPage`s, it adds methods
+such as:
+
+ - [getInitialPage()](https://teamnewpipe.github.io/NewPipeExtractor/javadoc/org/schabi/newpipe/extractor/ListExtractor.html#getInitialPage--)
+   which returns the first page of InfoItems.
+ - [getNextPageUrl()](https://teamnewpipe.github.io/NewPipeExtractor/javadoc/org/schabi/newpipe/extractor/ListExtractor.html#getNextPageUrl--)
+   which returns the URL pointing to the second page of InfoItems, if one is available.
+ - [getPage()](https://teamnewpipe.github.io/NewPipeExtractor/javadoc/org/schabi/newpipe/extractor/ListExtractor.html#getPage-java.lang.String-)
+   which returns the ListExtractor.InfoItemsPage for a given URL, as retrieved from the `getNextPageUrl()` method of the previous page.
+
+
+The first page is handled specially because many websites, such as YouTube, load the first page of
+items like a regular web page, but all the others via AJAX requests.
 
 
 
diff --git a/docs/img/InfoItemsCollector_objectdiagram.svg b/docs/img/InfoItemsCollector_objectdiagram.svg
index 2f986a9..d661de9 100644
--- a/docs/img/InfoItemsCollector_objectdiagram.svg
+++ b/docs/img/InfoItemsCollector_objectdiagram.svg
@@ -1,39 +1,39 @@
 <?xml version="1.0" encoding="UTF-8" standalone="no"?>
 <!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.0//EN" "http://www.w3.org/TR/2001/PR-SVG-20010719/DTD/svg10.dtd">
-<svg width="20cm" height="8cm" viewBox="199 199 382 159" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink">
+<svg width="22cm" height="8cm" viewBox="199 199 435 159" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink">
   <g>
     <rect style="fill: #ffffff" x="200" y="260" width="141.3" height="36"/>
     <rect style="fill: none; fill-opacity:0; stroke-width: 2; stroke: #000000" x="200" y="260" width="141.3" height="36"/>
-    <text font-size="12.7998" style="fill: #000000;text-anchor:middle;font-family:sans-serif;font-style:normal;font-weight:normal" x="270.65" y="281.9">
+    <text font-size="12.8" style="fill: #000000;text-anchor:middle;font-family:sans-serif;font-style:normal;font-weight:normal" x="270.65" y="281.9">
       <tspan x="270.65" y="281.9">:InfoItemsCollector</tspan>
     </text>
     <line style="fill: none; fill-opacity:0; stroke-width: 1; stroke: #000000" x1="210" y1="284.9" x2="331.3" y2="284.9"/>
   </g>
   <g>
-    <rect style="fill: #ffffff" x="400" y="200" width="179.25" height="36"/>
-    <rect style="fill: none; fill-opacity:0; stroke-width: 2; stroke: #000000" x="400" y="200" width="179.25" height="36"/>
-    <text font-size="12.7998" style="fill: #000000;text-anchor:middle;font-family:sans-serif;font-style:normal;font-weight:normal" x="489.625" y="221.9">
-      <tspan x="489.625" y="221.9">itemExtractor1:Extractor</tspan>
+    <rect style="fill: #ffffff" x="400" y="200" width="232.65" height="36"/>
+    <rect style="fill: none; fill-opacity:0; stroke-width: 2; stroke: #000000" x="400" y="200" width="232.65" height="36"/>
+    <text font-size="12.8" style="fill: #000000;text-anchor:middle;font-family:sans-serif;font-style:normal;font-weight:normal" x="516.325" y="221.9">
+      <tspan x="516.325" y="221.9">itemExtractor1:InfoItemExtractor</tspan>
     </text>
-    <line style="fill: none; fill-opacity:0; stroke-width: 1; stroke: #000000" x1="410" y1="224.9" x2="569.25" y2="224.9"/>
+    <line style="fill: none; fill-opacity:0; stroke-width: 1; stroke: #000000" x1="410" y1="224.9" x2="622.65" y2="224.9"/>
   </g>
   <g>
-    <rect style="fill: #ffffff" x="400" y="260" width="179.25" height="36"/>
-    <rect style="fill: none; fill-opacity:0; stroke-width: 2; stroke: #000000" x="400" y="260" width="179.25" height="36"/>
-    <text font-size="12.7998" style="fill: #000000;text-anchor:middle;font-family:sans-serif;font-style:normal;font-weight:normal" x="489.625" y="281.9">
-      <tspan x="489.625" y="281.9">itemExtractor2:Extractor</tspan>
+    <rect style="fill: #ffffff" x="400" y="260" width="232.65" height="36"/>
+    <rect style="fill: none; fill-opacity:0; stroke-width: 2; stroke: #000000" x="400" y="260" width="232.65" height="36"/>
+    <text font-size="12.8" style="fill: #000000;text-anchor:middle;font-family:sans-serif;font-style:normal;font-weight:normal" x="516.325" y="281.9">
+      <tspan x="516.325" y="281.9">itemExtractor2:InfoItemExtractor</tspan>
     </text>
-    <line style="fill: none; fill-opacity:0; stroke-width: 1; stroke: #000000" x1="410" y1="284.9" x2="569.25" y2="284.9"/>
+    <line style="fill: none; fill-opacity:0; stroke-width: 1; stroke: #000000" x1="410" y1="284.9" x2="622.65" y2="284.9"/>
   </g>
   <g>
-    <rect style="fill: #ffffff" x="400" y="320" width="179.25" height="36"/>
-    <rect style="fill: none; fill-opacity:0; stroke-width: 2; stroke: #000000" x="400" y="320" width="179.25" height="36"/>
-    <text font-size="12.7998" style="fill: #000000;text-anchor:middle;font-family:sans-serif;font-style:normal;font-weight:normal" x="489.625" y="341.9">
-      <tspan x="489.625" y="341.9">itemExtractor3:Extractor</tspan>
+    <rect style="fill: #ffffff" x="400" y="320" width="232.65" height="36"/>
+    <rect style="fill: none; fill-opacity:0; stroke-width: 2; stroke: #000000" x="400" y="320" width="232.65" height="36"/>
+    <text font-size="12.8" style="fill: #000000;text-anchor:middle;font-family:sans-serif;font-style:normal;font-weight:normal" x="516.325" y="341.9">
+      <tspan x="516.325" y="341.9">itemExtractor3:InfoItemExtractor</tspan>
     </text>
-    <line style="fill: none; fill-opacity:0; stroke-width: 1; stroke: #000000" x1="410" y1="344.9" x2="569.25" y2="344.9"/>
+    <line style="fill: none; fill-opacity:0; stroke-width: 1; stroke: #000000" x1="410" y1="344.9" x2="622.65" y2="344.9"/>
   </g>
-  <polyline style="fill: none; fill-opacity:0; stroke-width: 2; stroke: #000000" points="342.309,278 370.652,278 370.652,218 398.994,218 "/>
-  <polyline style="fill: none; fill-opacity:0; stroke-width: 2; stroke: #000000" points="342.309,278 343.309,278 397.994,278 398.994,278 "/>
-  <polyline style="fill: none; fill-opacity:0; stroke-width: 2; stroke: #000000" points="342.309,278 370.652,278 370.652,338 398.994,338 "/>
+  <polyline style="fill: none; fill-opacity:0; stroke-width: 2; stroke: #000000" points="342.309,278 370.651,278 370.651,218 398.993,218 "/>
+  <polyline style="fill: none; fill-opacity:0; stroke-width: 2; stroke: #000000" points="342.309,278 343.309,278 397.993,278 398.993,278 "/>
+  <polyline style="fill: none; fill-opacity:0; stroke-width: 2; stroke: #000000" points="342.309,278 370.651,278 370.651,338 398.993,338 "/>
 </svg>