
The Link Mapper module records incoming requests that result in a 404 in real time. Editors can respond to each 404 with 200 (serve a target on the original URI and hide the error), 301 or 302 (redirect), or 410 (gone), which tells clients such as crawlers to stop requesting the URL.

Installation

Maven is the easiest way to install the module. Add the following dependency to your bundle:

<dependency>
  <groupId>info.magnolia.linkmapper</groupId>
  <artifactId>magnolia-linkmapper</artifactId>
  <version>${linkMapperVersion}</version>
</dependency>

Be aware that the app changed completely between versions 3.3 and 3.4. If you have a previous version installed, remove the module from your project first to prevent issues.

Versions

Version   Magnolia       Collection backend
3.4.1     Magnolia 6.2   ready for Quarkus backend
3.3       Magnolia 6.2   ready for Quarkus backend
3.2.2     Magnolia 6.1   ready for Quarkus backend
3.2       Magnolia 6.1   ready for Elastic backend
3.1       Magnolia 5.7
3.0       Magnolia 5.6

Usage

Once installed and configured, the Link Mapper module stores its data on a third-party server. The public instances send data to that server each time a 404 is detected. The author instance can then fetch this data on demand in the 404Links app. Editors examine the data and decide which response (redirect action) each broken link should get. These redirects are published to the public instances in much the same way as virtual URI mappings. The difference is that virtual URI mappings are handled before rendering, while the 404 redirects are handled after rendering.

Configuration

clientIdentifier

required

To use the link mapper module you will need to obtain a client identifier from Magnolia. This identifier is used to track the data obtained by the filter. This property is set in the module config.

token

required

To authenticate against the collection server, you need the JWT token that the collection server writes to its console on startup.

baseUrl

required

The base URL of the third-party service that collects the data. It is obtained from Magnolia along with the client identifier.

Up to version 3.2.2:

This property is set on linkMapperService, a node in the rest-client registry.

From version 3.3:

From version 3.3 on, the REST client is defined in YAML, so the property can be set with a decorator.

<decorating-module-name>/decorations/linkmapper/restClients/linkMapperService.yaml
baseUrl: 'http://localhost:8090/lima/v1/'

404Links app

From the author instance, editors can use the 404Links app to view the data collected on the public instances by the broken links filter. Data is collected from the server using the Reload action. Broken links are displayed along with the corresponding site name and access count. Results can be filtered and sorted in a variety of different ways.

Each entry can be published to the public instances. Use the well-known Publish action to trigger the publication. The Archive action moves a link to the Archive tab. If the link is accessed again, the node is automatically unarchived on the next Reload.

The publication of a node is only available after setting the Target and Redirect Type.

There is also an action available from the browser view to quickly blacklist an item.

Archive

The archive keeps the main list manageable, even after the app has been in use for a long time. The same search and filter functions are available in the archive. To edit an archived link, however, you must unarchive it first. If an archived link is accessed again, it is automatically unarchived on the next Reload.
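
The unarchive-on-reload rule can be pictured with a minimal Java sketch. This is not the module's implementation: the LinkEntry class, its fields, and the idea that a Reload compares the stored access count with the count reported by the collection server are assumptions made for illustration only.

```java
/** Hypothetical in-memory stand-in for a broken-link entry in the 404Links app. */
class LinkEntry {
    final String uri;
    long accessCount;
    boolean archived;

    LinkEntry(String uri, long accessCount, boolean archived) {
        this.uri = uri;
        this.accessCount = accessCount;
        this.archived = archived;
    }

    /**
     * Applies the access count reported by a Reload.
     * Assumption: a count higher than the stored one means the link was requested again.
     */
    void applyReload(long reportedCount) {
        if (reportedCount > accessCount) {
            accessCount = reportedCount;
            archived = false; // accessed again, so the entry returns to the main list
        }
    }
}

class ArchiveDemo {
    public static void main(String[] args) {
        LinkEntry entry = new LinkEntry("/old-page", 5, true);

        entry.applyReload(5); // no new hits since archiving
        System.out.println("archived after unchanged count: " + entry.archived);

        entry.applyReload(7); // two new hits reported
        System.out.println("archived after new hits: " + entry.archived);
    }
}
```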

Dialog

The edit dialog lets you define a redirect type and a redirect target. You can also review the list of referrers and query parameters.

Link name
Name of the link shown in the table
Target
  • Page - Choose a page from the pages app
  • Internal link - Enter an internal link (e.g. /travel)
  • External link - Enter an external link (e.g. http://myexternalsite.com)
  • Blacklist - Do not redirect to any page. Returns HTTP status code 410 instead. (Choose this option in combination with the redirect type 410)
Redirect type
  • 200: keep URL - Target will be served on the original URI, which may look like a page duplicate.
  • 301: permanent - Use in case the original URI will never exist again (default).
  • 302: temporary - Use in case the original URI will exist again in the future.
  • 410: blacklist - Announce that this page is permanently gone and not likely to ever appear again.

Original Name
Original name from first reload
Original URI
URL that caused the 404 error



When editing an item, you can see the referrers and query parameters of the requests that led to the 404. This can help you track down broken links that originate from within the system.
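
The redirect types offered in the dialog correspond to HTTP status codes. The following Java sketch models that mapping. The names RedirectType, isRedirect, needsPageOrLinkTarget, and fromStatusCode are hypothetical and are not taken from the module; the rule that 410 needs no page or link target is a reading of the Blacklist option described above.

```java
import java.util.Optional;

/** Hypothetical model of the redirect types offered in the edit dialog. */
enum RedirectType {
    KEEP_URL(200),   // target is served on the original URI
    PERMANENT(301),  // original URI will never exist again (default)
    TEMPORARY(302),  // original URI will exist again in the future
    GONE(410);       // blacklist: the page is permanently gone

    final int statusCode;

    RedirectType(int statusCode) {
        this.statusCode = statusCode;
    }

    /** Only 301 and 302 send the client to a different URL. */
    boolean isRedirect() {
        return this == PERMANENT || this == TEMPORARY;
    }

    /** Assumption: every type except 410 needs a page, internal link, or external link. */
    boolean needsPageOrLinkTarget() {
        return this != GONE;
    }

    /** Looks up a redirect type by status code; empty for unsupported codes. */
    static Optional<RedirectType> fromStatusCode(int code) {
        for (RedirectType type : values()) {
            if (type.statusCode == code) {
                return Optional.of(type);
            }
        }
        return Optional.empty();
    }
}

class RedirectTypeDemo {
    public static void main(String[] args) {
        for (RedirectType type : RedirectType.values()) {
            System.out.println(type.statusCode + " " + type
                    + " redirect=" + type.isRedirect()
                    + " needsTarget=" + type.needsPageOrLinkTarget());
        }
    }
}
```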

Collection server


The Quarkus collection server only works with module version 3.2.2 or later (see Versions).

The collection server was developed with the Quarkus framework and is available from the repository below. It includes a Dockerfile for building a deployable container.

The server needs a PostgreSQL database to store the collected information. The connection to this database can be passed as parameters or configured on the server in the file application.properties.

Repository

linkmapper-quarkus-postgresql

Configuration

The following parameters can be configured:

quarkus.log.level

required, default is INFO

The log level of the root category, which is used as the default log level for all categories.

quarkus.http.port

required, default is 8090

The HTTP port.

quarkus.datasource.username

required

The PostgreSQL datasource username.

quarkus.datasource.password

required

The PostgreSQL datasource password.

quarkus.datasource.reactive.url

required

The PostgreSQL datasource URL.

quarkus.datasource.reactive.max-size

required, default is 20

The datasource pool maximum size.

linkmapper.schema.create

required, default is true

If set to true, the server drops all tables in the datasource and recreates them on server startup.

linkmapper.jwt.generate.new.key.pair

required, default is false

If set to true, the server creates a new public/private key pair for JWT authentication on server startup.
When this happens, every previously generated token becomes invalid.
On the first server startup, a fresh key pair is generated even if this property is set to false.
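
To see how the documented defaults combine with your own settings, the Java sketch below layers a sample application.properties text over the defaults listed above and reports which parameters without a default are still missing. The class CollectionServerConfig and its methods are hypothetical helpers, and the username, password, and URL values are placeholders; Quarkus itself resolves these properties at startup, so this only illustrates the expected keys.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

/** Hypothetical helper that mirrors the parameter list of the collection server. */
class CollectionServerConfig {
    /** Parameters that are required and have no documented default. */
    static final String[] NO_DEFAULT = {
        "quarkus.datasource.username",
        "quarkus.datasource.password",
        "quarkus.datasource.reactive.url"
    };

    /** Defaults as documented in the parameter list. */
    static Properties defaults() {
        Properties d = new Properties();
        d.setProperty("quarkus.log.level", "INFO");
        d.setProperty("quarkus.http.port", "8090");
        d.setProperty("quarkus.datasource.reactive.max-size", "20");
        d.setProperty("linkmapper.schema.create", "true");
        d.setProperty("linkmapper.jwt.generate.new.key.pair", "false");
        return d;
    }

    /** Parses application.properties text on top of the documented defaults. */
    static Properties resolve(String applicationProperties) {
        Properties effective = new Properties(defaults());
        try {
            effective.load(new StringReader(applicationProperties));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return effective;
    }

    /** Returns the required keys that are still unset. */
    static List<String> missing(Properties effective) {
        List<String> missing = new ArrayList<>();
        for (String key : NO_DEFAULT) {
            if (effective.getProperty(key) == null) {
                missing.add(key);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Placeholder values; replace them with your own PostgreSQL settings.
        String sample =
              "quarkus.datasource.username=linkmapper\n"
            + "quarkus.datasource.password=changeme\n"
            + "quarkus.datasource.reactive.url=postgresql://localhost:5432/linkmapper\n"
            + "linkmapper.schema.create=false\n";
        Properties effective = resolve(sample);
        System.out.println("port: " + effective.getProperty("quarkus.http.port"));
        System.out.println("schema.create: " + effective.getProperty("linkmapper.schema.create"));
        System.out.println("missing: " + missing(effective));
    }
}
```

Because linkmapper.schema.create defaults to true, the sample sets it to false explicitly so that existing tables are kept across restarts.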


Security

At every server start, a valid JWT token is generated and written to the server console. Tokens remain valid until a new public/private key pair is generated (see the collection server configuration).
For the connection to the collection server to work, a valid token must be entered in the Magnolia backend (see Configuration).
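
As a rough illustration of how a client might present the token, the Java sketch below builds (but does not send) a request with the standard java.net.http API. It assumes the collection server expects the JWT in an Authorization header with the Bearer scheme, which this page does not confirm. The class and method names are hypothetical, the base URL is the example value from the configuration section, and the token is a placeholder.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Hypothetical helper; the module's own REST client handles this internally. */
class CollectionServerRequest {
    /**
     * Builds a GET request against the collection server base URL.
     * Assumption: the JWT is sent as "Authorization: Bearer <token>".
     */
    static HttpRequest authenticatedGet(String baseUrl, String jwtToken) {
        return HttpRequest.newBuilder(URI.create(baseUrl))
                .header("Authorization", "Bearer " + jwtToken)
                .GET()
                .build();
    }

    public static void main(String[] args) {
        // Placeholder token; use the one printed on the collection server console.
        HttpRequest request = authenticatedGet("http://localhost:8090/lima/v1/", "example.jwt.token");
        System.out.println(request.method() + " " + request.uri());
        System.out.println("has Authorization header: "
                + request.headers().firstValue("Authorization").isPresent());
    }
}
```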

Warnings

  • This module is at the INCUBATOR level.
  • Redirects are evaluated after rendering, so affected requests pass through the filter chain twice and take longer (a penalty of about 20-50 ms). On the other hand, unlike virtual URI mappings, they do not slow down any other requests, no matter how many redirects you define.
  • You may run into slowness when loading large amounts of data into the JCR from the collection server.

Changelog

  • Version 3.2
  • Version 3.1
    • Updated for Magnolia 5.7 compatibility. 
    • MBLINKS-17
  • Version 3.0 - Initial release of the extension's version of the module.
