Update readme; update debug prints
parent f57aef07c0
commit 23f7407366
18
README.md
@@ -8,12 +8,24 @@
 ## High level approach
 
-todo
+We started by creating robust abstracted HTTP-handling code, which is located in the `smol-http`
+module of this project. The HTTP code implements a subset of HTTP 1.1 which is enough to meet the
+requirements for crawling the target web server. It also uses plain TCP sockets to communicate using
+its HTTP implementation. We used Racket standard library functions to parse and manipulate URLs as
+well as parse HTML (as XML, hopefully it's well-formed!) in order to find the hyperlinks on the page
+as well as the flags. We implemented a high performance Certified Web Scale(tm) crawling scheduler
+with a distributed work queue to allow for very high-rate crawling. On our machines, the crawler
+takes minutes to complete and finds all the flags very quickly.
 
 ## Challenges
 
-todo
+The current pandemic situation continues to make this semester difficult. Otherwise, we didn't run
+into any major issues during this project.
 
 ## Testing
 
-todo
+We unit tested the HTTP handling code in `smol-http`, and used ad-hoc manual testing against the
+target server to test the complete crawling functionality.
+
+We have an additional `-d` flag which will print useful debug info during the execution of the
+crawler, which may be helpful for manual testing.
 
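The link-finding step described in the README text above (parse the page as XML, then collect hyperlinks) might look roughly like the following. This is a minimal sketch, not the project's code: `attrs+children`, `xexpr-hrefs`, `page-links`, and the `example.com` URL are hypothetical names chosen for illustration. The only library calls used are `string->xexpr` from `xml` and `string->url`, `combine-url/relative`, and `url->string` from `net/url`. Like the README's approach, it assumes the page is well-formed XML and will raise an exception otherwise.

```racket
#lang racket/base
(require racket/list
         xml
         net/url)

;; XExpr (element form) -> (values attributes children)
;; Splits an element x-expression into its attribute list and its children.
;; The attribute list is optional in x-expressions, so default it to '().
(define (attrs+children x)
  (define rest (cdr x))
  (if (and (pair? rest) (list? (car rest)) (andmap pair? (car rest)))
      (values (car rest) (cdr rest))
      (values '() rest)))

;; XExpr -> [Listof String]
;; Collects the href value of every <a> element, in document order.
(define (xexpr-hrefs x)
  (cond
    [(and (pair? x) (symbol? (car x)))
     (define-values (attrs children) (attrs+children x))
     (define here
       (cond
         [(and (eq? (car x) 'a) (assq 'href attrs))
          => (λ (attr) (list (cadr attr)))]
         [else '()]))
     (append here (append-map xexpr-hrefs children))]
    [else '()])) ; strings, entities, comments, etc. contain no links

;; String String -> [Listof String]
;; Parses html-str as XML and resolves every link against base-str.
(define (page-links base-str html-str)
  (define base (string->url base-str))
  (for/list ([href (in-list (xexpr-hrefs (string->xexpr html-str)))])
    (url->string (combine-url/relative base href))))
```

For example, `(page-links "http://example.com/home" "<p><a href=\"/x\">x</a></p>")` resolves the single link against the hypothetical base URL. Resolving links before queueing them lets the scheduler deduplicate pages by absolute URL.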
@@ -29,9 +29,10 @@
 ;; ->
 ;; Prints a completion message to the console, only when debug mode is on
-(define (print-complete)
+(define (print-complete total-pages num-flags)
   (when (debug-mode?)
-    (printf "\r\x1b[KCrawl complete\n")))
+    (printf "\r\x1b[KCrawl complete: ~a pages crawled, ~a flags found\n"
+            total-pages num-flags)))
 
 ;; Str ->
 ;; Prints a flag
 
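To see the effect of the new `print-complete` signature in isolation, here is a small self-contained sketch. It assumes `debug-mode?` is a Racket parameter; the diff only shows it being called as `(debug-mode?)`, so its real definition may differ.

```racket
#lang racket/base
(require racket/port)

;; Assumption: debug mode is stored in a parameter (not shown in the diff).
(define debug-mode? (make-parameter #f))

;; Nat Nat ->
;; Prints a completion summary, only when debug mode is on.
;; "\r\x1b[K" returns to the start of the line and clears it, so the
;; summary overwrites any in-place progress output.
(define (print-complete total-pages num-flags)
  (when (debug-mode?)
    (printf "\r\x1b[KCrawl complete: ~a pages crawled, ~a flags found\n"
            total-pages num-flags)))

;; Captures what print-complete writes, with debug mode forced on.
(define (debug-summary total-pages num-flags)
  (parameterize ([debug-mode? #t])
    (with-output-to-string
      (λ () (print-complete total-pages num-flags)))))
```

With debug mode off the call prints nothing, which is why the summary is safe to leave in the normal code path.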
@@ -157,7 +157,7 @@ (set-count completed) (unbox num-flags))
 
 (loop)))
-(print-complete)
+(print-complete (set-count completed) (unbox num-flags))
 ;; send all workers the shutdown message and wait
 (for ([thd (in-vector worker-threads)])
   (thread-send thd #f)
 
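The shutdown step in the last hunk relies on Racket thread mailboxes: each worker is sent `#f` as a stop message. A minimal sketch of that pattern follows. `make-worker` and `shutdown-workers!` are hypothetical names, and waiting with `thread-wait` is an assumption, since the diff is cut off before the wait code.

```racket
#lang racket/base

;; (Any -> Any) -> Thread
;; Starts a worker that handles each mailbox message until it receives #f.
(define (make-worker handle)
  (thread
   (λ ()
     (let loop ()
       (define msg (thread-receive)) ; blocks until a message arrives
       (when msg
         (handle msg)
         (loop))))))

;; [Vectorof Thread] ->
;; Sends every worker the shutdown message, then waits for all of them.
(define (shutdown-workers! worker-threads)
  (for ([thd (in-vector worker-threads)])
    (thread-send thd #f))
  (for ([thd (in-vector worker-threads)])
    (thread-wait thd)))
```

Because a thread's mailbox is a FIFO queue, any work sent to a worker before the `#f` is handled before that worker exits.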