I wanted to do something more but didn't know what to do or where to start. So, I decided to go back to the beginning and redo one of my first programming assignments, PageCheck. PageCheck is a program that takes a URL and a time interval and returns true if the corresponding page has been modified within the time interval, and false otherwise. This is the basic version.
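The program below seems to be correct. For some webpages, it returns the correct values. For other webpages, the values are not correct. These webpages don't seem to have a last modified date: their last modified date prints as 'Wed Dec 31 14:00:00 HST 1969'. These are webpages like Amazon, YouTube, and ics.hawaii.edu. According to the Java API documentation, getLastModified() returns the value of the last-modified header field. The result is the number of milliseconds since January 1, 1970 GMT, or 0 if not known. So the 1969 date is not what makes last modified return 0. It is the other way around: the method returns 0, and 0 milliseconds since January 1, 1970 GMT, shown in Hawaii time (ten hours behind GMT), is the 1969 date. What I still don't get is why these pages have no last modified value.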
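As a quick side check, separate from the CheckPage listing that follows, here is a minimal sketch of where the 1969 date comes from. The class and method names (EpochZeroDemo, inHawaii, describe) are made up for this illustration, and the "Pacific/Honolulu" zone is an assumption standing in for the HST clock in the output above.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZonedDateTime;

// Hypothetical demo class, not part of the original assignment.
class EpochZeroDemo {

    /** Value URLConnection.getLastModified() returns when the date is not known. */
    static final long UNKNOWN = 0L;

    /** Converts milliseconds since January 1, 1970 GMT to Hawaii time (assumed zone). */
    static ZonedDateTime inHawaii(long epochMillis) {
        return Instant.ofEpochMilli(epochMillis).atZone(ZoneId.of("Pacific/Honolulu"));
    }

    /** Describes a last modified value, treating 0 as "unknown" instead of a real date. */
    static String describe(long lastModifiedMillis) {
        if (lastModifiedMillis == UNKNOWN) {
            return "unknown (no last modified value reported)";
        }
        return inHawaii(lastModifiedMillis).toString();
    }

    public static void main(String[] args) {
        // 0 ms after the epoch is 14:00 on December 31, 1969 in Hawaii time.
        System.out.println(inHawaii(0L));
        // Treating 0 as a sentinel avoids printing that misleading date.
        System.out.println(describe(0L));
    }
}
```

Checking for 0 before converting to a date is one way to keep the "unknown" case from looking like a real timestamp.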
import java.io.IOException;
import java.net.URL;
import java.net.URLConnection;
import java.util.Date;

/**
 * The CheckPage class implements a system to notify the user
 * when a webpage has been modified within the past 24 hours.
 *
 * @author Keola Ng
 * @version $Id$
 */
public class CheckPage {

  /** The number of milliseconds in one hour. */
  private static final long MILLIS_PER_HOUR = 3600000L;

  /** The url connection object that connects to the web page to check. */
  private URLConnection urlConnection;

  /** The time interval, in hours, to check the web page. */
  private int timeInterval;

  /**
   * Constructs a valid CheckPage object otherwise throws an exception.
   *
   * @param urlString The url of the web page to check.
   * @param timeInterval The time interval, in hours, to check the web page.
   * @throws IOException problem connecting to the url in urlString.
   */
  public CheckPage(String urlString, int timeInterval) throws IOException {
    URL openUrl = new URL(urlString);
    this.urlConnection = openUrl.openConnection();
    this.urlConnection.connect();
    this.timeInterval = timeInterval;
  }

  /**
   * Determines when the web page was last modified.
   *
   * @return The last modified date in milliseconds since January 1, 1970 GMT,
   *         or 0 if the server did not report one.
   */
  public long getLastModified() {
    return this.urlConnection.getLastModified();
  }

  /**
   * Checks if the URL has been modified within the time interval.
   *
   * @return true if the page was modified within the last timeInterval hours;
   *         false if it is older or its last modified date is unknown.
   */
  public boolean isModified() {
    long lastModifiedMillis = getLastModified();
    long currentMillis = System.currentTimeMillis();
    System.out.println("Date: " + new Date(currentMillis));

    // 0 means "not known"; printing it as a date would show the 1969/1970 epoch.
    if (lastModifiedMillis == 0) {
      System.out.println("Last modified: unknown");
      return false;
    }

    // Compare in whole hours, using long to avoid narrowing casts.
    long lastModified = lastModifiedMillis / MILLIS_PER_HOUR;
    long currentTime = currentMillis / MILLIS_PER_HOUR;
    long num = currentTime - lastModified;

    System.out.println("Last modified: " + new Date(lastModifiedMillis));
    System.out.println(lastModifiedMillis + " / " + MILLIS_PER_HOUR
        + " = " + lastModified);
    System.out.println(currentMillis + " / "
        + MILLIS_PER_HOUR + " = " + currentTime);
    System.out.println(currentTime + " - " + lastModified + " = "
        + num + " < " + this.timeInterval);

    return num < this.timeInterval;
  }

  /**
   * Checks one page against a 24 hour interval and prints the result.
   *
   * @param args Ignored.
   * @throws IOException problem connecting to the url.
   */
  public static void main(String[] args) throws IOException {
    CheckPage cp = new CheckPage("http://www.ics.hawaii.edu/", 24);
    if (cp.isModified()) {
      System.out.println("True");
    } else {
      System.out.println("False");
    }
  }
}
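The interval check in isModified() can also be pulled out into a small function that needs no network connection, which makes it easier to try with known values. This is only a sketch: the names ModifiedWithinCheck and modifiedWithin are invented here, and returning false for an unknown (0) timestamp is my assumption about what the assignment wants. It compares milliseconds directly instead of whole hours, so the result does not depend on where the hour boundaries fall.

```java
import java.time.Duration;

// Hypothetical helper class, shown only to illustrate the interval check.
class ModifiedWithinCheck {

    /**
     * Returns true if lastModifiedMillis falls within the given window before nowMillis.
     * A value of 0 is treated as "unknown" and reported as not modified (an assumption).
     */
    static boolean modifiedWithin(long lastModifiedMillis, long nowMillis, Duration window) {
        if (lastModifiedMillis == 0L) {
            return false;
        }
        return nowMillis - lastModifiedMillis < window.toMillis();
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        Duration day = Duration.ofHours(24);

        // Modified two hours ago: inside the 24 hour window.
        System.out.println(modifiedWithin(now - Duration.ofHours(2).toMillis(), now, day)); // true
        // Modified two days ago: outside the window.
        System.out.println(modifiedWithin(now - Duration.ofHours(48).toMillis(), now, day)); // false
        // No last modified value reported.
        System.out.println(modifiedWithin(0L, now, day)); // false
    }
}
```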
1 comment:
I think it could be that those pages aren't static HTML pages. They could be some sort of dynamic page, which doesn't give a last modified date.
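One way to test this guess would be to ask the connection for the raw header instead of the parsed date. URLConnection.getHeaderField(String) returns null when the server did not send the named header. The sketch below assumes that check is enough to tell the two cases apart; the class name LastModifiedHeaderCheck and the method hasLastModified are hypothetical, and the default URL is simply the one used in the post.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLConnection;

// Hypothetical helper class for inspecting the Last-Modified response header.
class LastModifiedHeaderCheck {

    /** Returns true if the response carries a Last-Modified header at all. */
    static boolean hasLastModified(URLConnection connection) {
        // getHeaderField returns null when the named header is absent.
        return connection.getHeaderField("Last-Modified") != null;
    }

    public static void main(String[] args) throws IOException {
        // Pass a URL as the first argument, or fall back to the page from the post.
        String address = args.length > 0 ? args[0] : "http://www.ics.hawaii.edu/";
        URLConnection connection = new URL(address).openConnection();
        System.out.println(address + " sends Last-Modified: " + hasLastModified(connection));
    }
}
```

If this prints false for a page, getLastModified() has nothing to parse and falls back to 0, which would match the behavior described in the post.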