Training, Open Source computer languages

Please visit http://www.wellho.net/resources/P667.html for the latest update to this page.

Related technical and longer articles
Data Monging
Well House Consultants
You are on the site of Well House Consultants who provide Open Source Training Courses and business hotel accommodation. You are welcome to browse and use our resources subject to our copyright statement and to add in links from your pages to ours.
Handling Huge Data - Perl module P667
If you have so much data that it won't all fit into memory at once, you may not be able to use conventional programming techniques to complete your task. We define such a data set as "huge data"; it's impossible to handle in some languages, but very practical in Perl. This module doesn't introduce many new language features; instead, it shows you how to use what you already know to handle huge data practically.
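
As a rough illustration of the idea (not taken from the course material), the sketch below reads a file one line at a time and keeps only running totals, so memory use depends on the number of distinct keys rather than on the size of the file. The record layout ("host bytes"), the subroutine name summarise and the sample data are assumptions made for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Count records and total bytes per host while holding only one line
# of the file in memory at a time.
# Assumption: each line has the hypothetical form "<host> <bytes>".
sub summarise {
    my ($path) = @_;
    my (%hits, %bytes);
    open(my $fh, '<', $path) or die "Cannot open $path: $!";
    while (my $line = <$fh>) {          # one line per iteration, no slurping
        chomp $line;
        my ($host, $size) = split ' ', $line;
        next unless defined $size;      # skip malformed lines
        $hits{$host}++;
        $bytes{$host} += $size;
    }
    close $fh;
    return (\%hits, \%bytes);
}

# A tiny stand-in for a huge file, so that the sketch runs anywhere.
my ($out, $sample) = tempfile(UNLINK => 1);
print {$out} "alpha 100\nbeta 250\nalpha 50\n";
close $out;

my ($hits, $bytes) = summarise($sample);
for my $host (sort keys %$hits) {
    print "$host: $hits->{$host} hits, $bytes->{$host} bytes\n";
}
```

By contrast, a statement such as my @all = <$fh>; loads every line into an array before any processing starts, which is the step that fails once the data outgrows memory.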

This topic is presented on our public course, Perl for Larger Projects.

Examples from our training material
behind   Looking behind in huge data files
huge1   A program to test handling a small part of a huge data set
huge2   Providing user feedback while handling huge data
huge3   Asking a long running application for intermediate reports
huge3.pid   Example of the huge.pid file
makedirs   Preprocessing a huge data file to set up indexes
makeindex   Generating a list of markers to a huge sorted data set
opt2   Sorting and data filtering efficiency
opt3   Improving sort efficiency
opt4   Improving sort efficiency further - caching record analysis
optim   Optimising code to avoid repeating calculations
out.txt   Example of search results written to file
readtime   Efficiency - reading a file in large blocks
reg_opt   Regular expression match - inefficient example
reg_opt1   Regular expression match - don't save $' $` and $&
reg_opt2   Regular expression match - use of "o" modifier
reg_opt3   Regular expression match - more specific and faster
reg_opt4   Regular expression match - a start assertion speeds it up!
rt2   Handling data in chunks - chunk overlap issue solved
site.pm   Class used in other examples in this module
useindex   Grab first ten sites on a topic area - QUICKLY via index
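
The makeindex and useindex examples listed above are not reproduced on this page. As a hedged sketch of the general technique they describe, the code below records the byte offset at which each topic first appears in a sorted file, then uses seek to jump straight to a topic instead of scanning from the top. The tab-separated layout, the subroutine names build_index and first_sites, and the sample URLs are all assumptions for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Record the byte offset at which each topic first appears.
# Assumptions: the file is already sorted by topic, and each line has
# the hypothetical form "<topic>\t<url>".
sub build_index {
    my ($path) = @_;
    my %offset;
    open(my $fh, '<', $path) or die "Cannot open $path: $!";
    while (1) {
        my $pos  = tell($fh);           # offset of the line about to be read
        my $line = <$fh>;
        last unless defined $line;
        chomp $line;
        my ($topic) = split /\t/, $line;
        next unless defined $topic;     # skip empty lines
        $offset{$topic} = $pos unless exists $offset{$topic};
    }
    close $fh;
    return \%offset;
}

# Jump straight to a topic and return up to $max URLs for it.
sub first_sites {
    my ($path, $index, $topic, $max) = @_;
    return () unless exists $index->{$topic};
    open(my $fh, '<', $path) or die "Cannot open $path: $!";
    seek($fh, $index->{$topic}, 0) or die "Cannot seek in $path: $!";
    my @urls;
    while (my $line = <$fh>) {
        chomp $line;
        my ($t, $url) = split /\t/, $line;
        last if !defined $t || $t ne $topic;   # sorted file: block has ended
        push @urls, $url;
        last if @urls >= $max;
    }
    close $fh;
    return @urls;
}

# A tiny sorted sample standing in for a huge data file.
my ($out, $data) = tempfile(UNLINK => 1);
print {$out} join '', map { "$_\n" } (
    "cycling\thttp://c1.example/",
    "racing\thttp://r1.example/",
    "racing\thttp://r2.example/",
    "sailing\thttp://s1.example/",
);
close $out;

my $index = build_index($data);
print "$_\n" for first_sites($data, $index, 'racing', 10);
```

In practice the index would probably be written to a file of its own, so that it is built once and reused; here it is kept in a hash to keep the sketch short.
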
Opentalk forum discussions related to this topic
max array size
Regular Expression Efficiency
Reduce the time taken for Huge Log files
Monitoring a Perl program
Filter Large Log Files
Fastest way to replace chars
Background information
Some modules are available as a free download from http://www.training-notes.co.uk, either as a sample of our material or under an Open Training Notes License.
Topics covered in this module
What is a huge amount of data?
Planning.
General techniques.
Code optimisation.
Regular expressions.
Sorting.
Avoiding loops.
Storing data in memory.
"Hello Huge World".
User feedback.
Signals and tails to monitor and control a long process.
Reading the data by line or by block.
Arranging and storing the data.
Using a directory structure.
Indexing.
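
To illustrate the "by block" reading named in the topic list, here is a minimal sketch (not the course's own code) that counts occurrences of a fixed string while reading a file in fixed-size blocks. A match can straddle two blocks, so the last few characters of each buffer are carried forward and prepended to the next block. The subroutine name count_in_blocks, the block size and the sample data are assumptions for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Count occurrences of $needle (overlapping ones included) in a file,
# reading $block_size bytes at a time.
sub count_in_blocks {
    my ($path, $needle, $block_size) = @_;
    die "Needle must not be empty" unless length $needle;
    open(my $fh, '<', $path) or die "Cannot open $path: $!";
    binmode $fh;
    my $keep  = length($needle) - 1;    # longest tail that could start a match
    my $carry = '';
    my $count = 0;
    while (read($fh, my $block, $block_size)) {
        my $buffer = $carry . $block;
        my $pos = 0;
        while (($pos = index($buffer, $needle, $pos)) >= 0) {
            $count++;
            $pos++;
        }
        # The carried tail is shorter than the needle, so it can never
        # contain a whole match and nothing is counted twice.
        if    ($keep == 0)              { $carry = '' }
        elsif (length($buffer) > $keep) { $carry = substr($buffer, -$keep) }
        else                            { $carry = $buffer }
    }
    close $fh;
    return $count;
}

# Sample data in which "abc" straddles a 4-byte block boundary.
my ($out, $sample) = tempfile(UNLINK => 1);
print {$out} "abcabcXabc";
close $out;

print count_in_blocks($sample, 'abc', 4), "\n";
```

A real run would use a far larger block size, for example 64 KB or more; the 4-byte blocks here are only to force matches across block boundaries in a ten-byte sample.
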
Course links
The following web pages are used as references / examples in this module

[Link] Race Car Tech: race car tech articles.
(at http://www.racecartech.com/)

[Link] Sports - MotorSports - Grizzly Web Links
(at http://grizzlyweb.com/links/sports_racing.asp)

[Link] INACCESSIBLE
(at http://www.pubbersparadise.com/)

[Link] INACCESSIBLE
(at http://www.yocar.com/racing.html)

[Link] The Supercar Experience
(at http://www.thesupercarexperience.com.au/)

[Link] veescene.com
(at http://www.veescene.com)

[Link] MOTORKHANA WORLD Wide Web - December 9, 2002
(at http://www.geocities.com/MotorCity/4130/index.htm)

[Link] INACCESSIBLE
(at http://www.racedesert.com)

[Link] Nick's Demolition Derby Page
(at http://www.geocities.com/motorcity/speedway/1789/)

[Link] PaceCars.com
(at http://www.pacecars.com/)

[Link] No Title
(at http://dmoz.org/rdf/content.rdf.u8.gz)

[Link] Psychintegrator Plants, Art, Ethnobotany & Therapy -seminar homepage-
(at http://lycaeum.org/~entheos)

[Link] Sword Sex -- young amazon girls with knives erotica
(at http://www.sword-sex.com/)

[Link] INACCESSIBLE
(at http://www.sporting-girls.com/)

We check these links from time to time with a spider written in PHP. Latest full check was on Saturday, 12th June 2004. Titles are extracted from the web pages listed.

Complete learning
If you are looking for a complete course and not just information on a single subject, visit our Listing and schedule page.

Well House Consultants specialise in training courses in Python, Perl, PHP, and MySQL. We run Private Courses throughout the UK (and beyond for longer courses), and Public Courses at our training centre in Melksham, Wiltshire, England. It's surprisingly cost effective to come on our public courses - even if you live in a different country or continent to us.

We have a technical library of over 600 books on the subjects on which we teach. These books are available for reference at our training centre. Also available is the Opentalk Forum for discussion of technical questions.



WELL HOUSE CONSULTANTS LTD.: Well House Manor • 48 Spa Road • Melksham, Wiltshire • United Kingdom • SN12 7NY
PH: 0800 043 8225 or 01225 708225 • FAX: 0845 8382 405 or 01225 707126 • EMAIL: info@wellho.net • WEB: http://www.wellho.net • SKYPE: wellho