newpipe-documentation/01_Concept_of_the_extractor/index.html

<!DOCTYPE html>
<!--[if IE 8]><html class="no-js lt-ie9" lang="en" > <![endif]-->
<!--[if gt IE 8]><!--> <html class="no-js" lang="en" > <!--<![endif]-->
<head>
  <meta charset="utf-8">
  <meta http-equiv="X-UA-Compatible" content="IE=edge">
  <meta name="viewport" content="width=device-width, initial-scale=1.0">
  
  
  <link rel="shortcut icon" href="../img/favicon.ico">
  <title>Concept of the Extractor - NewPipe Documentation</title>
  <!-- local fonts -->
  <link rel="stylesheet" href="../css/local_fonts.css" type="text/css" />

  <link rel="stylesheet" href="../css/theme.css" type="text/css" />
  <link rel="stylesheet" href="../css/theme_extra.css" type="text/css" />
  <!-- local code syntax highlighting -->
  <link rel="stylesheet" href="../css/github.min.css" type="text/css" />
  <link rel="stylesheet" href="../css/highlight.css" type="text/css" />
  
  <script>
    // Current page data
    var mkdocs_page_name = "Concept of the Extractor";
    var mkdocs_page_input_path = "01_Concept_of_the_extractor.md";
    var mkdocs_page_url = null;
  </script>
  
  <script src="../js/jquery-2.1.1.min.js" defer></script>
  <script src="../js/modernizr-2.8.3.min.js" defer></script>
  <script src="../js/highlight.min.js"></script>
  <script>hljs.initHighlightingOnLoad();</script> 
  
</head>

<body class="wy-body-for-nav" role="document">

  <div class="wy-grid-for-nav">

    
    <nav data-toggle="wy-nav-shift" class="wy-nav-side stickynav">
      <div class="wy-side-nav-search">
        <a href=".." class="icon icon-home"> NewPipe Documentation</a>
        <div role="search">
  <form id ="rtd-search-form" class="wy-form" action="../search.html" method="get">
    <input type="text" name="q" placeholder="Search docs" title="Type search term here" />
  </form>
</div>
      </div>

      <div class="wy-menu wy-menu-vertical" data-spy="affix" role="navigation" aria-label="main navigation">
	<ul class="current">
	  
          
            <li class="toctree-l1">
		
    <a class="" href="..">Welcome to NewPipe.</a>
	    </li>
          
            <li class="toctree-l1">
		
    <a class="" href="../00_Prepare_everything/">Before You Start</a>
	    </li>
          
            <li class="toctree-l1 current">
		
    <a class="current" href="./">Concept of the Extractor</a>
    <ul class="subnav">
            
    <li class="toctree-l2"><a href="#concept-of-the-extractor">Concept of the Extractor</a></li>
    
        <ul>
        
            <li><a class="toctree-l3" href="#the-collectorextractor-pattern">The Collector/Extractor Pattern</a></li>
        
            <li><a class="toctree-l3" href="#collectorextractor-pattern-for-lists">Collector/Extractor Pattern for Lists</a></li>
        
            <li><a class="toctree-l3" href="#infoitems-encapsulated-in-pages">InfoItems Encapsulated in Pages</a></li>
        
        </ul>
    

    </ul>
	    </li>
          
            <li class="toctree-l1">
		
    <a class="" href="../02_Concept_of_LinkHandler/">Concept of the LinkHandler</a>
	    </li>
          
            <li class="toctree-l1">
		
    <a class="" href="../03_Implement_a_service/">Implementing a Service</a>
	    </li>
          
            <li class="toctree-l1">
		
    <a class="" href="../04_Run_changes_in_App/">Testing Your Changes in the App</a>
	    </li>
          
            <li class="toctree-l1">
		
    <a class="" href="../05_releasing/">Releasing a New NewPipe Version</a>
	    </li>
          
            <li class="toctree-l1">
		
    <a class="" href="../06_documentation/">About This Documentation</a>
	    </li>
          
        </ul>
      </div>
      &nbsp;
    </nav>

    <section data-toggle="wy-nav-shift" class="wy-nav-content-wrap">

      
      <nav class="wy-nav-top" role="navigation" aria-label="top navigation">
        <i data-toggle="wy-nav-top" class="fa fa-bars"></i>
        <a href="..">NewPipe Documentation</a>
      </nav>

      
      <div class="wy-nav-content">
        <div class="rst-content">
          <div role="navigation" aria-label="breadcrumbs navigation">
  <ul class="wy-breadcrumbs">
    <li><a href="..">Docs</a> &raquo;</li>
    
      
    
    <li>Concept of the Extractor</li>
    <li class="wy-breadcrumbs-aside">
      
    </li>
  </ul>
  <hr/>
</div>
          <div role="main">
            <div class="section">
              
                <h1 id="concept-of-the-extractor">Concept of the Extractor</h1>
<h2 id="the-collectorextractor-pattern">The Collector/Extractor Pattern</h2>
<p>Before you start coding your own service, you need to understand the basic concept of the extractor itself. There is a pattern
you will find all over the code, called the <strong>extractor/collector</strong> pattern. The idea behind it is that
the <a href="https://teamnewpipe.github.io/NewPipeExtractor/javadoc/org/schabi/newpipe/extractor/Extractor.html">extractor</a>
produces fragments of data, and the collector collects them and assembles them into a readable format for the front end.
The collector also controls the parsing process and takes care of error handling: if the extractor fails at any
point, the collector decides whether or not parsing should continue. This requires the extractor to be made up of
multiple methods, one for every data field the collector wants to have. The collectors are provided by NewPipe;
you need to take care of the extractors.</p>
<h3 id="usage-in-the-front-end">Usage in the Front End</h3>
<p>A typical call for retrieving data from a website would look like this:</p>
<pre><code class="java">Info info;
try {
    // Create a new Extractor with a given context provided as parameter.
    Extractor extractor = new Extractor(some_meta_info);
    // Retrieve the data from the extractor and build the info package.
    info = Info.getInfo(extractor);
} catch(Exception e) {
    // Handle errors raised when the collector decided to abort the extraction.
}
</code></pre>

<h3 id="typical-implementation-of-a-single-data-extractor">Typical Implementation of a Single Data Extractor</h3>
<p>The typical implementation of a single data extractor, on the other hand, would look like this:</p>
<pre><code class="java">class MyExtractor extends FutureExtractor {

    public MyExtractor(RequiredInfo requiredInfo, ForExtraction forExtraction) {
        super(requiredInfo, forExtraction);

        ...
    }

    @Override
    public void fetch() {
        // Actually fetch the page data here
    }

    @Override
    public String someDataField()
        throws ExtractionException {    // Must be thrown if something fails
        // Get one piece of information and return it
    }

    ...                                 // More data fields
}
</code></pre>
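<p>To make the division of labour concrete, here is a minimal, self-contained sketch of how a collector might drive an extractor.
It does not use the NewPipeExtractor API: <code>DemoExtractor</code>, <code>DemoInfo</code> and <code>collect()</code> are hypothetical names,
and the choice of which field is mandatory and which is optional is an assumption made only for illustration.</p>
<pre><code class="java">public class CollectorSketch {

    // Hypothetical stand-in for an extractor: one method per data field.
    interface DemoExtractor {
        String getName() throws Exception;        // treated as mandatory below
        String getDescription() throws Exception; // treated as optional below
    }

    // Hypothetical stand-in for the assembled info package.
    static class DemoInfo {
        String name;
        String description;
        int errorCount;
    }

    // Hypothetical collector: calls each field method and decides what a failure means.
    static DemoInfo collect(DemoExtractor extractor) throws Exception {
        DemoInfo info = new DemoInfo();

        // Mandatory field: a failure propagates and aborts the whole extraction.
        info.name = extractor.getName();

        // Optional field: a failure is recorded and parsing continues.
        try {
            info.description = extractor.getDescription();
        } catch (Exception e) {
            info.errorCount++;
        }
        return info;
    }

    public static void main(String[] args) throws Exception {
        DemoInfo info = collect(new DemoExtractor() {
            @Override
            public String getName() {
                return "Some title";
            }

            @Override
            public String getDescription() throws Exception {
                throw new Exception("description not found");
            }
        });
        // The name was extracted, the description failed without stopping the run.
        System.out.println(info.name + ", errors: " + info.errorCount);
    }
}
</code></pre>
<p>Because every data field has its own method, the collector can tell exactly which field failed and react per field.</p>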

<h2 id="collectorextractor-pattern-for-lists">Collector/Extractor Pattern for Lists</h2>
<p>Information can be represented as a list. In NewPipe, a list is represented by a
<a href="https://teamnewpipe.github.io/NewPipeExtractor/javadoc/org/schabi/newpipe/extractor/InfoItemsCollector.html">InfoItemsCollector</a>.
An InfoItemsCollector collects and assembles a list of <a href="https://teamnewpipe.github.io/NewPipeExtractor/javadoc/org/schabi/newpipe/extractor/InfoItem.html">InfoItem</a>s.
For each item that should be extracted, a new extractor must be created and handed to the InfoItemsCollector via <a href="https://teamnewpipe.github.io/NewPipeExtractor/javadoc/org/schabi/newpipe/extractor/InfoItemsCollector.html#commit-E-">commit()</a>.</p>
<p><img alt="InfoItemsCollector_objectdiagram.svg" src="../img/InfoItemsCollector_objectdiagram.svg" /></p>
<p>If you are implementing a list for your service, you need to extend InfoItem so that it contains the extracted information,
and implement an <a href="https://teamnewpipe.github.io/NewPipeExtractor/javadoc/org/schabi/newpipe/extractor/Extractor.html">InfoItemExtractor</a>
that returns the data of one InfoItem.</p>
<p>A common implementation would look like this:</p>
<pre><code>private MyInfoItemCollector collectInfoItemsFromElement(Element element) {
    MyInfoItemCollector collector = new MyInfoItemCollector(getServiceId());

    // Commit one extractor per child element; the collector reads its fields.
    for (final Element li : element.children()) {
        collector.commit(new InfoItemExtractor() {
            @Override
            public String getName() throws ParsingException {
                ...
            }

            @Override
            public String getUrl() throws ParsingException {
                ...
            }

            ...
        });
    }
    return collector;
}

</code></pre>
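<p>The following self-contained sketch illustrates what a <code>commit()</code> call could do with each extractor it receives.
It is not the NewPipeExtractor implementation: <code>DemoItemExtractor</code> and <code>DemoItemsCollector</code> are hypothetical names,
plain strings stand in for the page elements, and skipping an item whose extractor fails is an assumption made for illustration.</p>
<pre><code class="java">import java.util.Arrays;

public class ListCollectorSketch {

    // Hypothetical item extractor: one method per field of a single item.
    interface DemoItemExtractor {
        String getName() throws Exception;
        String getUrl() throws Exception;
    }

    // Hypothetical collector that assembles the committed items.
    static class DemoItemsCollector {
        private String[] items = new String[0];
        private int skipped = 0;

        // Reads all fields of one item; a failing item is skipped, not fatal.
        void commit(DemoItemExtractor extractor) {
            try {
                String entry = extractor.getName() + " " + extractor.getUrl();
                items = Arrays.copyOf(items, items.length + 1);
                items[items.length - 1] = entry;
            } catch (Exception e) {
                skipped++;
            }
        }

        String[] getItems() {
            return items;
        }

        int getSkipped() {
            return skipped;
        }
    }

    // Creates one extractor per raw entry and commits it to the collector.
    static DemoItemsCollector collect(String[] rawEntries) {
        DemoItemsCollector collector = new DemoItemsCollector();
        for (final String raw : rawEntries) {
            collector.commit(new DemoItemExtractor() {
                @Override
                public String getName() throws Exception {
                    if (raw.isEmpty()) {
                        throw new Exception("entry has no name");
                    }
                    return raw;
                }

                @Override
                public String getUrl() {
                    return "https://example.com/" + raw;
                }
            });
        }
        return collector;
    }

    public static void main(String[] args) {
        DemoItemsCollector collector = collect(new String[] {"a", "", "b"});
        // The empty entry fails in getName() and is skipped.
        System.out.println(collector.getItems().length + " items, "
                + collector.getSkipped() + " skipped");
    }
}
</code></pre>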

<h2 id="infoitems-encapsulated-in-pages">InfoItems Encapsulated in Pages</h2>
<p>When a streaming site shows a list of items, it usually offers some additional information about that list, such as its title, a thumbnail,
and its creator. This information can be called the <strong>list header</strong>.</p>
<p>When a website shows a long list of items, it usually does not load the whole list, but only a part of it. To get more items, you may have to click a next-page button or scroll down.</p>
<p>This is why lists in NewPipe are split into smaller lists called <a href="https://teamnewpipe.github.io/NewPipeExtractor/javadoc/org/schabi/newpipe/extractor/ListExtractor.InfoItemsPage.html">InfoItemsPage</a>s. Each page has its own URL and needs to be extracted separately.</p>
<p>Both the additional metadata about the list and the extraction of multiple pages can be handled by a
<a href="https://teamnewpipe.github.io/NewPipeExtractor/javadoc/org/schabi/newpipe/extractor/ListExtractor.html">ListExtractor</a>
and its <a href="https://teamnewpipe.github.io/NewPipeExtractor/javadoc/org/schabi/newpipe/extractor/ListExtractor.InfoItemsPage.html">ListExtractor.InfoItemsPage</a>.</p>
<p>For extracting list header information, the ListExtractor behaves like a regular extractor. For handling <code>InfoItemsPages</code>, it adds methods
such as:</p>
<ul>
<li><a href="https://teamnewpipe.github.io/NewPipeExtractor/javadoc/org/schabi/newpipe/extractor/ListExtractor.html#getInitialPage--">getInitialPage()</a>
   returns the first page of InfoItems.</li>
<li><a href="https://teamnewpipe.github.io/NewPipeExtractor/javadoc/org/schabi/newpipe/extractor/ListExtractor.html#getNextPageUrl--">getNextPageUrl()</a>
   returns the URL pointing to the second page of InfoItems, if one is available.</li>
<li><a href="https://teamnewpipe.github.io/NewPipeExtractor/javadoc/org/schabi/newpipe/extractor/ListExtractor.html#getPage-java.lang.String-">getPage()</a>
   returns a ListExtractor.InfoItemsPage for a URL that was retrieved by the <code>getNextPageUrl()</code> method of the previous page.</li>
</ul>
<p>The first page is handled specially because many websites, such as YouTube, load the first page of
items like a regular web page, but all the others via AJAX requests.</p>
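<p>The way a front end might walk through all pages of a list can be sketched as follows. The sketch is self-contained and does not call the
NewPipeExtractor API: <code>DemoPage</code> and <code>DemoListExtractor</code> are hypothetical, the data lives in memory instead of on a website,
and storing the next page URL on the page object (with <code>null</code> meaning that there is no further page) is an assumption made for illustration.</p>
<pre><code class="java">public class PagingSketch {

    // Hypothetical page: some items plus the URL of the next page (null if none).
    static class DemoPage {
        final String[] items;
        final String nextPageUrl;

        DemoPage(String[] items, String nextPageUrl) {
            this.items = items;
            this.nextPageUrl = nextPageUrl;
        }
    }

    // Hypothetical list extractor backed by in-memory data.
    static class DemoListExtractor {
        // The first page is fetched without a page URL.
        DemoPage getInitialPage() {
            return new DemoPage(new String[] {"a", "b"}, "page2");
        }

        // Every further page is fetched by the URL the previous page provided.
        DemoPage getPage(String url) {
            if (url.equals("page2")) {
                return new DemoPage(new String[] {"c"}, "page3");
            }
            return new DemoPage(new String[] {"d"}, null);
        }
    }

    // Follows the next page URLs until there is no further page.
    static int countAllItems(DemoListExtractor extractor) {
        DemoPage page = extractor.getInitialPage();
        int count = page.items.length;
        while (page.nextPageUrl != null) {
            page = extractor.getPage(page.nextPageUrl);
            count += page.items.length;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countAllItems(new DemoListExtractor())); // prints 4
    }
}
</code></pre>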
              
            </div>
          </div>
          <footer>
  
    <div class="rst-footer-buttons" role="navigation" aria-label="footer navigation">
      
        <a href="../02_Concept_of_LinkHandler/" class="btn btn-neutral float-right" title="Concept of the LinkHandler">Next <span class="icon icon-circle-arrow-right"></span></a>
      
      
        <a href="../00_Prepare_everything/" class="btn btn-neutral" title="Before You Start"><span class="icon icon-circle-arrow-left"></span> Previous</a>
      
    </div>
  

  <hr/>

  <div role="contentinfo">
    <!-- Copyright etc -->
    
  </div>

  Built with <a href="http://www.mkdocs.org">MkDocs</a> using a <a href="https://github.com/snide/sphinx_rtd_theme">theme</a> provided by <a href="https://readthedocs.org">Read the Docs</a>.
</footer>
      
        </div>
      </div>

    </section>

  </div>

  <div class="rst-versions" role="note" style="cursor: pointer">
    <span class="rst-current-version" data-toggle="rst-current-version">
      
      
        <span><a href="../00_Prepare_everything/" style="color: #fcfcfc;">&laquo; Previous</a></span>
      
      
        <span style="margin-left: 15px"><a href="../02_Concept_of_LinkHandler/" style="color: #fcfcfc">Next &raquo;</a></span>
      
    </span>
</div>
    <script>var base_url = '..';</script>
    <script src="../js/theme.js" defer></script>
      <script src="../search/main.js" defer></script>

</body>
</html>