[MAGNOLIA-3989] Provide script in integration test to parse all delivered content. Created: 02/Mar/12  Updated: 23/Jun/14  Resolved: 20/Jun/13

Status: Closed
Project: Magnolia
Component/s: testing
Affects Version/s: 4.5
Fix Version/s: 4.5.1

Type: Improvement Priority: Neutral
Reporter: Robert Šiška Assignee: Robert Šiška
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Cloners
is cloned by MGNLEE-220 CLONE -Provide script in integration ... Closed
relation
is related to MGNLEE-220 CLONE -Provide script in integration ... Closed
is related to MAGNOLIA-4027 Do not hardcode username and password... Closed

 Description   

The script will look for HTTP errors and FreeMarker errors.



 Comments   
Comment by Robert Šiška [ 06/Mar/12 ]

The script is written in Groovy and uses the NekoHTML library. It crawls the specified websites and downloads all pages, images, JavaScript files and CSS stylesheets. When it encounters any HTTP status other than 200, or finds a FreeMarker error, it throws an exception.
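
As an illustration of the crawling step, here is a minimal sketch of how the linked resources of one page might be collected. It is written in Python with the standard html.parser module rather than Groovy and NekoHTML, so it is not the actual script; the names ResourceCollector and collect_resources, and the choice of tags to follow, are assumptions made for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ResourceCollector(HTMLParser):
    """Collects URLs of linked pages, images, scripts and stylesheets."""

    # Assumed mapping: tag name -> attribute that carries the URL.
    URL_ATTRS = {"a": "href", "img": "src", "script": "src", "link": "href"}

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        wanted = self.URL_ATTRS.get(tag)
        if wanted is None:
            return
        for name, value in attrs:
            if name == wanted and value:
                # Resolve relative references against the page URL.
                self.urls.append(urljoin(self.base_url, value))


def collect_resources(base_url, html):
    """Return the absolute URLs referenced by one HTML page, in document order."""
    collector = ResourceCollector(base_url)
    collector.feed(html)
    collector.close()
    return collector.urls
```

A crawler would fetch each returned URL in turn and repeat the extraction for HTML pages until the configured depth is reached.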

The web pages to be crawled are specified either as command-line arguments or, when run from Maven, as project properties.
The properties take the form <geturl?>http://URL/</geturl?> or <geturlauth?>http://URL/</geturlauth?>, where ? stands for a suffix that keeps the property names distinct (for example, a number). The latter form logs in with the superuser credentials. If you use authenticated crawling, you need to define the <login> and <password> properties.

(P.S.: it would be much better to get rid of <geturlauth> and use only <geturl> with a login query, but properties apparently can't contain some special characters. Yes, I tried escaping AND URL encoding.)

The search depth and other parameters can be changed only in the script itself.

For example, the properties can look like this:

<properties>
       <login>superuser</login>
       <password>superuser</password>
       <geturl1>http://localhost:8088/magnoliaTestPublic/ftl-sample-site/</geturl1>
       <geturl2>http://localhost:8088/magnoliaTestPublic/jsp-sample-site/</geturl2>
       <geturlauth3>http://localhost:8088/magnoliaTest/ftl-sample-site/</geturlauth3>
       <geturlauth4>http://localhost:8088/magnoliaTest/jsp-sample-site/</geturlauth4>
</properties>
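
To make the naming convention concrete, here is a minimal sketch of how such properties could be turned into a list of crawl targets. It is Python, not the Groovy script itself, and it assumes the properties are already available as a plain dictionary; the function name collect_targets and the exact matching rule are assumptions for illustration.

```python
import re

# Assumed naming rule: "geturl" or "geturlauth" followed by an arbitrary suffix.
_URL_PROPERTY = re.compile(r"^geturl(auth)?(.*)$")


def collect_targets(properties):
    """Return (url, needs_login) pairs for every geturl*/geturlauth* property."""
    targets = []
    for name, value in properties.items():
        match = _URL_PROPERTY.match(name)
        if match:
            targets.append((value, match.group(1) is not None))

    # Authenticated crawling only works when both credentials are defined.
    if any(needs_login for _, needs_login in targets):
        missing = [key for key in ("login", "password") if key not in properties]
        if missing:
            raise ValueError("authenticated crawling requires: " + ", ".join(missing))
    return targets
```

With the properties shown above, this would yield the two public URLs with needs_login set to False and the two magnoliaTest URLs with needs_login set to True.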

The script throws a RuntimeException when:

  • it gets any HTTP status other than 200,
  • it finds a page with a FreeMarker error or a RenderException stack trace,
  • it finds a page that is empty.
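
The three failure conditions above can be sketched as a single check per downloaded page. This is a Python illustration, not the Groovy code of the script; the function name check_page is hypothetical, and the marker strings are assumptions, since this issue does not state which text the script actually searches for.

```python
# Assumed markers; the strings the real script looks for are not given in this issue.
ERROR_MARKERS = ("FreeMarker template error", "RenderException")


def check_page(url, status, body):
    """Raise RuntimeError if the response looks broken; return None otherwise."""
    if status != 200:
        raise RuntimeError(f"{url}: unexpected HTTP status {status}")
    if not body.strip():
        raise RuntimeError(f"{url}: page is empty")
    for marker in ERROR_MARKERS:
        if marker in body:
            raise RuntimeError(f"{url}: found '{marker}' in page")
```

Raising on the first problem makes the integration test fail fast and names the offending URL in the message.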
Comment by Robert Šiška [ 20/Jun/13 ]

Added: redirects to different hosts are now ignored.
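
One way such a check could look is sketched below, in Python rather than the Groovy of the script. The function name is_foreign_redirect is hypothetical, and comparing host names only (without ports) is an assumption; the issue does not say how the script decides that a host is different.

```python
from urllib.parse import urljoin, urlsplit


def is_foreign_redirect(request_url, location):
    """True when a redirect Location header points at a different host."""
    # The Location value may be relative, so resolve it against the request URL.
    target = urljoin(request_url, location)
    # hostname is lower-cased by urlsplit, so the comparison is case-insensitive.
    return urlsplit(target).hostname != urlsplit(request_url).hostname
```

A crawler would skip the redirect target when this returns True and follow it otherwise.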
