The Web Laboratory: Publications
Papers that give an overview of the Web Lab
- Arms, W., Aya, S., Dmitriev, P., Kot, B., Mitchell, R., Walle, L., Building a Research Library for the History of the Web. Joint Conference on Digital Libraries, June 2006. MS Word.
- Arms, W., Aya, S., Dmitriev, P., Kot, B., Mitchell, R., Walle, L., A Research Library based on the Historical Collections of the Internet Archive. D-Lib Magazine, February 2006. http://www.dlib.org/dlib/february06/arms/02arms.html.
Technical reports
Fall 2004
- Karthik Jeyabalan, Jerrin Kallukalam, Ariel Rabkin, Patrick Reilly, Nurwati Widodo, Web Research Infrastructure Project Final Report Fall 2004, December 17, 2004.
Spring 2005
- Mayank Gandhi, Jimmy Yanbo Sun, ARC Data Extraction and summarization, May 2005.
- Karthik Jeyabalan, Jerrin Kallukalam, Representation of Web Graph for in Memory Computation, May 2005.
- Shantanu Shah, Generating a Web Graph, May 2005.
- Richard Yu Wang, Web Research Infrastructure Database Section Semester Research Report, May 2005.
Fall 2005
- Benzaquen, S., Guo, W., The Web Laboratory: Preload System, Fall 2005 Final Report. December 2005.
- Gerner, N., Sosa, C., Fall 2005 Semester Report for Web Lab Database Load Group. December 2005.
- Gu, M.-D., User Tools: Basic Access API Design and Implementation. December 2005.
- Jain, P., Shtokman, D., Tiwari, H., Data Movement Research Project. December 2005.
- Kohli, S., Sanghi, L., Data Monitoring and Tracking. December 2005.
- Siddavanahalli, M., Singhal, S., Web Lab - Subset Extraction. December 2005.
- Shah, S., Retro Browser. December 2005.
Spring 2006
- Gerner, N., WebLibrary Design Progress Report. May 2006.
- Murarka, S., Web Graph Project. May 2006.
- Sosa, C. B., Jain, P., Shtokman, D., Web Library: Data Movement Spring 2006 Report. May 2006.
- Zhu, N., Basic Access API. May 2006.
Fall 2006
- Adil Aijaz, Heritrix WebLab. December 2006.
- Andrzej Kielbasinski, Data Movement and Tracking. December 2006.
- Dmitriy Shtokman, Web Library: Data Movement Fall 2006 Report. December 2006.
Spring 2007
- Laran Evans, Web Research Infrastructure. May 2007.
- Kyeongseo Hwang, Jung Kwan Kim and Hardeep Singh, Index to the History of the Web. May 2007.
- Kwan Dong Kim and Chang Min Kim, PageRank Calculation using Sparse Matrix in Clustered Computer Environment. May 2007.
- Sangwoo Kim, Sanjay Rajan, and Sean Seguin, Web Graph Generation. May 2007.
- Andrzej Kielbasinski, Data Movement and Tracking, Spring 2007 Report. May 2007.
- Dmitriy Shtokman, Web Library: Data Movement Spring 2007 Report. May 2007.
Fall 2007
- Asha Balasubramaniam and Dmitriy Shtokman, The Web Laboratory: Data Movement and Tracking Team Fall 2007 Report. December 2007.
- Wioletta Holownia and Michal Kuklis, WebLab Site Development and Researchers' Tools. December 2007.
- Anthony Jawad and Jie Teng, Web Graph Generation: Fall 2007 Report. December 2007.
- Chang Min Kim and Thomas Chen, PageRank Calculation. December 2007.
- Ashish Virmani and Neha Arora, Anchor Text Analysis. December 2007.
Spring 2008
- Asha Balasubramaniam, The Web Laboratory Project Data Movement and Tracking Report. May 2008.
- Vijayanand Chokkapu and Asif-ul Haque, PageRank Calculation using Map Reduce. May 2008.
- Prashant Baktha Kumara Dhas and Jasim Mohammed, An anchor text analysis of links to five state government websites for the years 2004 and 2005. May 2008.
- Wioletta Holownia, Michal Kuklis and Natasha Qureshi, Web Lab Collaboration Server and Web Lab Website. May 2008.
- Manu Jain, Gayatri Kaul and Aditi Lyall, Web Graph Generation. May 2008.
- Lokesh K Sharma, Sandeep S Shekhawat and Sneha Khadye, Data Profiler Tool. May 2008.
Summer 2008
- Manu Jain, Web Graph Generation. August 2008.
Fall 2008
- Jacob Bank and Benjamin Cole, Calculating the Jaccard Similarity Coefficient with Map Reduce for Entity Pairs in Wikipedia. December 2008.
- Nayan Busa, Unmesh Jagtap, and Utkarsh Prateek, PageRank Calculation using Map Reduce. December 2008.
- Xingfu Dong, Hubs and Authorities Calculation using MapReduce. December 2008.
- Zhiyu Zhang, Web Graph Cleaning using Map Reduce. December 2008.
Spring 2009
- Razen Alharbi, An Open Source GUI for Web Graph Cleaning on Hadoop. May 2009.
- Jacob Bank, Exploring the .gov Domain from 2002 to 2005: InDegree and OutDegree Analyses. May 2009.
- Chris Frommann, Analysis of OutLinks and Growth in the .gov TLD. May 2009.
- Oleg Krokhin, Efficient extraction of individual pages from a complete web crawl. May 2009.
- Pradeep Mani, Open Source suite for Page Rank calculation on Hadoop Cluster using Map-Reduce. May 2009.
- Shraddha Ladda and Subhashri Suresh, Weblab - .gov domain analysis. May 2009.
- Tom Ternquist and Jonathan Yu, An Investigation into SOLR and the Full-text Indexing of Books. May 2009.
- Aditi Vad, Pig Latin Evaluation through PageRank Algorithm Implementation. May 2009.