Working with PDF-Formatted Periodicals
Newspapers and magazines are important resources for foreign language
teachers, allowing them to integrate into the classroom setting authentic
materials which are current and interesting both to the students and to the
populations which speak the target language. A frequent problem for people who
want to use newspapers and magazines as teaching materials is that they are
often inconvenient and expensive to purchase. Many print publications around
the world now maintain web sites which make all or most of the content of
their print editions available for free to anyone with access to a web
browser. This makes cost and availability less of a problem, provided that the
teacher has the skills to manipulate the web content and distribute it to
students in a useful format.
While most web content, including that of sites associated with print
periodicals, is in HTML (web page format), many newspapers and some magazines
are available on the Internet as PDF files. This format has some distinct
advantages for the teacher who wants to use the content to create printed
instructional materials. This page will compare some of the features of the PDF
format with those of HTML (web page format) and of popular image formats
(such as JPEG), and explain how to "clip" an article from a PDF file to create
language teaching materials.
File Formats: PDF, HTML, and Images
PDF format. The Portable Document Format, or PDF, was created by Adobe,
makers of the popular software products Adobe Acrobat and Adobe Photoshop. More
information about the PDF format can be found on the Adobe web site. A PDF file
is a relatively compact file which can contain both text and images. A PDF file
will look the same on the screen and
in print, regardless of what kind of computer or printer it is viewed on and
regardless of what software package was originally used to create it. In this
respect, it is like an image file (JPEG, GIF, etc.). But unlike an image file,
which will appear blurry if enlarged to any size larger than its original size,
a PDF file usually contains text and other content which remains sharp and
clear regardless of what size it is enlarged to.
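The enlargement problem can be put in numbers. The following is a minimal
sketch for illustration only: the function name effective_dpi and the sample
figure of a 600-pixel-wide image are assumptions, not taken from any real
periodical. It shows how the dots per inch delivered by an image file fall as
the printed width grows, which is why the image turns blurry, whereas PDF text
is usually redrawn sharply at any size.

```python
def effective_dpi(pixel_width: int, print_width_inches: float) -> float:
    """Return the dots per inch a raster image delivers at a given printed width."""
    if print_width_inches <= 0:
        raise ValueError("print width must be positive")
    return pixel_width / print_width_inches


# Hypothetical example: one 600-pixel-wide image printed at increasing widths.
for width_in in (2.0, 4.0, 8.0):
    print(f"{width_in:.0f} in wide -> {effective_dpi(600, width_in):.0f} dpi")
```

Doubling the printed width halves the resolution, because the same pixels are
spread over twice the distance.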
Periodicals available on the web in PDF format are generally broken down so
that each page is contained in a separate PDF file. The PDF version is generally
an exact copy (sometimes with advertising omitted) of the print version with
respect to both content and layout. Use of PDF versions of periodicals can
therefore give the learner the experience of dealing with authentic printed
magazines and newspapers.
PDF files can be read on Windows, OS X, and Linux using the free Adobe Acrobat
Reader program and web browser plug-in, available from the Adobe web site, as
well as with other free and third-party software. Chances are
that Acrobat Reader is already installed on your computer.
Web pages in HTML format. HTML (Hypertext Markup Language) and
derivatives like XHTML are the formats used for most web pages, such as this
one. HTML pages are rendered directly by the browser rather than via a plug-in,
and can contain text, images, and form elements (such as checkboxes, buttons,
and text input fields). The way in which HTML treats text has a few features
which are useful to the teacher. For one, the text is rendered in fonts whose
sizes can usually be enlarged or made smaller by the reader by changing the
browser's settings. Another feature is that blocks of text can usually be
selected with a mouse and copied into a word processor for reformatting. Many
print periodicals maintain sites which post content reformatted as web pages in
HTML format. While this reformatted content is ideal for reading directly from
a computer screen, using articles reformatted in this way has certain
disadvantages for the teacher who wants to print out the material for use in
the classroom:
- Essential and non-essential formatting elements. It is usually difficult
  or impossible to print out web pages in a way which preserves essential
  formatting elements, such as the placement of images related to the text,
  without also retaining elements which are annoying in the print medium, such
  as space wasted by banners and sidebars and inappropriate font sizes.
- Inappropriate elements. Web page elements which simply do not translate
  into the non-interactive medium of print either annoy the reader, as in the
  case of hyperlinks, or waste valuable space, as in the case of buttons,
  checkboxes, input fields, and other form elements.
- No pagination. HTML is not a paginated format. A "page" of Web content
  is intended to be viewed as one continuous scroll, regardless of the length of
  the content, and browsers offer little or no control over how these scrolls
  are broken up into multiple physical pages for printing.
- Web look-and-feel. Print-outs of HTML formatted material usually have
  the feel of a web page, rather than of a magazine or some other printed
  format. Articles printed out from a browser do not give the learner the
  feeling of having worked with a real newspaper or magazine. Conversely,
  articles culled from web pages and reformatted in a word processor give the
  student the feeling of working with specially prepared instructional
  materials rather than with authentic materials.
These disadvantages can be summarized by saying that HTML documents are
designed to look like web pages and are intended to be experienced using a web
browser.
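For teachers who do decide to cull an article from a web page, the copying
step can be automated. The sketch below is only one possible approach, built
on Python's standard html.parser module. The class name ArticleTextExtractor,
the function extract_text, the choice of which tags to skip, and the sample
markup are all assumptions made for the example. It keeps the readable text
and drops form elements, so that only the words of the article reach the word
processor.

```python
from html.parser import HTMLParser


class ArticleTextExtractor(HTMLParser):
    """Collect readable text; skip scripts, styles, and form controls."""

    SKIPPED = {"script", "style", "form", "button", "select", "textarea"}
    BLOCK = {"p", "div", "br", "li", "h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # greater than 0 while inside a skipped element
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1
        elif tag in self.BLOCK:
            self._chunks.append(" ")  # keep block-level elements apart

    def handle_endtag(self, tag):
        if tag in self.SKIPPED:
            self._skip_depth = max(0, self._skip_depth - 1)
        elif tag in self.BLOCK:
            self._chunks.append(" ")

    def handle_data(self, data):
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        # Collapse runs of whitespace left over from the markup.
        return " ".join("".join(self._chunks).split())


def extract_text(html: str) -> str:
    parser = ArticleTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


# Hypothetical fragment: a headline, a search form, and one paragraph.
sample = (
    "<h1>Headline</h1>"
    "<form><button>Search</button></form>"
    "<p>First paragraph with a <a href='#'>link</a>.</p>"
)
print(extract_text(sample))
```

The link text survives but the link itself does not, which suits the print
medium. All layout is lost, however, so the result still reads as prepared
material rather than as a page from a periodical.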
Pages saved as images. There are a few web sites which offer archived
versions of printed magazines in image formats such as JPEG. These are
generally not of much use to the language teacher. The reason is that to obtain
a reasonably small file size, the publisher must save the image with a
relatively low resolution. This means that if you try to enlarge the image to
make the print easier for your students to read, the image becomes blurry and
hence just as hard to read as it was at the original size. Frequently, such
archives are created by breaking the printed page down into several image
strips rather than saving it as a single image. While such a page will display
correctly in your web
browser, it is difficult and impractical to manipulate in any other computer
program, because each of the strips must be saved separately then reassembled
in the other program.
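To give a sense of what reassembling the strips involves, here is a minimal
sketch of the bookkeeping. It assumes the strips are horizontal bands stacked
from top to bottom, which the sites in question may or may not use, and the
function name stack_strips and the sample sizes are hypothetical. It works out
how large the combined image must be and where each strip has to be pasted.

```python
def stack_strips(strip_sizes):
    """Given the (width, height) of each strip from top to bottom, return
    the canvas size and the (x, y) offset at which to paste each strip."""
    if not strip_sizes:
        raise ValueError("at least one strip is required")
    canvas_width = max(width for width, _ in strip_sizes)
    offsets = []
    y = 0
    for _, height in strip_sizes:
        offsets.append((0, y))  # each strip starts where the previous one ended
        y += height
    return (canvas_width, y), offsets


# Hypothetical page saved as three strips.
canvas, offsets = stack_strips([(800, 300), (800, 300), (800, 250)])
print("canvas size:", canvas)
print("paste offsets:", offsets)
```

Every strip adds one more file to save and one more offset to get right, and
the reassembled page is still a low-resolution image.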
The following tips will assume that you have installed Adobe Acrobat Reader
and can open PDF files. In addition to this, you will need a graphics program
to manipulate the image we capture from the Acrobat Reader program. Windows XP
has the Paint utility, located under Accessories. If you would like a more
sophisticated graphics program, or you are not using Windows XP, you can
download the free, open source GIMP graphics program from
GIMP.org. Versions of the GIMP are available for Windows, OS X,
Linux, and other platforms.
Clipping an article from a PDF page. The following steps describe how
to clip an article from a PDF file to produce an image which you can either
print out or embed in a word processor document to enrich with additional
material. Follow these steps:
1. Open the PDF file in Acrobat Reader and find the article you want to clip.
   (If Acrobat Reader is properly configured as a browser plug-in, you should
   be able to open the file just by clicking on the link to the PDF file in
   your browser.) The PDF document then opens inside the browser window,
   displayed by the Acrobat Reader plug-in.
2. Zoom in close so that the text of the article is rather large. The size and
   resolution of the image we are going to make depend on the resolution and
   size of the view on the screen, so zoom in at this point until the print is
   large and clear enough to read comfortably.
3. Select the Acrobat Reader snapshot tool. This will allow you to create a
   rectangular selection of the PDF page on your screen. Click and hold down
   the mouse button at the top left-hand corner of the area you want to clip.
   Drag your mouse to the lower right-hand corner of the area you want to clip.
   If that area is below the bottom of the window, drag the cursor below the
   bottom of the window and Acrobat Reader will slowly scroll the document in
   that direction. When you have reached the lower right-hand corner of the
   area you want to clip, release the mouse button. A dialog box will then
   appear telling you that the area has been copied to the clipboard.
4. Now open your graphics program and paste the image from the clipboard.
   - If you're using Windows XP Paint, paste the image into a blank document
     and save it in PNG or JPEG format.
   - If you're using the GIMP, select Edit > Acquire > From Clipboard from the
     menu, then save the image in PNG or JPEG format.
5. At this point you can either manipulate (scale, trim, rotate, etc.) and
   print the image from within your graphics program or import the image into
   a word processor file to add material such as vocabulary notes,
   bibliographic information, and questions.
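The effect of the zoom level on the clipped image can be estimated in advance.
The sketch below is a rough model under stated assumptions, not a description
of how Acrobat Reader works internally: it assumes the snapshot is captured at
screen resolution, takes 96 dots per inch as the screen resolution, and uses a
hypothetical article measuring 216 by 360 points (3 by 5 inches on paper). The
function name snapshot_pixels is invented for the example. PDF page dimensions
are measured in points, with 72 points to the inch.

```python
POINTS_PER_INCH = 72  # PDF page dimensions are measured in points


def snapshot_pixels(width_pt, height_pt, zoom, screen_dpi=96):
    """Estimate the pixel size of a screen capture of a PDF region.

    zoom is a fraction (2.0 means 200%); screen_dpi=96 is an assumed value.
    """
    scale = zoom * screen_dpi / POINTS_PER_INCH
    return round(width_pt * scale), round(height_pt * scale)


# Hypothetical article region of 216 x 360 points at three zoom levels.
for zoom in (1.0, 2.0, 3.0):
    width_px, height_px = snapshot_pixels(216, 360, zoom)
    print(f"{zoom:.0%} zoom: {width_px} x {height_px} pixels")
```

Under these assumptions, each step up in zoom adds pixels in both directions,
which is what gives the clipped image enough detail to survive printing or
enlarging.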
This work is licensed under a Creative Commons License.
- You may use and modify the material for any non-commercial purpose.
- You must credit the UCLA Language Materials Project as the source.
- If you alter, transform, or build upon this work, you may distribute the resulting work only under a license identical to this one.