Raw data as input (non-process mining ready)

Let's say we want to process a file with Beamline, but the file does not match the format expected by any of the sources already implemented. This example shows how to process such a file.

For the sake of simplicity, let's consider a file where each line refers to one event: the first 3 characters of the line identify the case id, and the rest of the line is the activity name. This is an example of such a file (where 001 and 002 are the case ids, and ActA, B and Act_C are the activity names):

002ActA
001ActA
002B
002Act_C
001B
001Act_C
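
To make the line layout concrete, here is a minimal sketch in plain Java (no Beamline dependency) that splits each line into case id and activity name and groups the activities by case. The class RawLineParser and its methods are hypothetical names introduced only for this illustration, and the length check for malformed lines is an assumption that goes beyond the simple format described above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Beamline: splits raw lines into case id and activity name.
final class RawLineParser {

    // Number of leading characters that hold the case id, as described in the text.
    static final int CASE_ID_LENGTH = 3;

    // Returns {caseId, activityName}; rejects lines too short to contain both parts (assumption).
    static String[] parse(String line) {
        if (line == null || line.length() <= CASE_ID_LENGTH) {
            throw new IllegalArgumentException("Malformed line: " + line);
        }
        return new String[] { line.substring(0, CASE_ID_LENGTH), line.substring(CASE_ID_LENGTH) };
    }

    // Groups activity names by case id, preserving the order in which events appear.
    static Map<String, List<String>> tracesByCase(List<String> lines) {
        Map<String, List<String>> traces = new LinkedHashMap<>();
        for (String line : lines) {
            String[] parts = parse(line);
            traces.computeIfAbsent(parts[0], k -> new ArrayList<>()).add(parts[1]);
        }
        return traces;
    }

    public static void main(String[] args) {
        List<String> lines = List.of("002ActA", "001ActA", "002B", "002Act_C", "001B", "001Act_C");
        // prints {002=[ActA, B, Act_C], 001=[ActA, B, Act_C]}
        System.out.println(tracesByCase(lines));
    }
}
```

Both cases follow the same sequence of activities, even though their events are interleaved in the file.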

To accomplish our goal, we first need to define a source capable of processing the file:

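// logFile holds the path of the raw file and is assumed to be defined elsewhere
BeamlineAbstractSource customSource = new BeamlineAbstractSource() {
   @Override
   public void run(SourceContext<BEvent> ctx) throws Exception {
      // try-with-resources (java.util.stream.Stream) closes the file once all lines are read
      try (Stream<String> lines = Files.lines(Path.of(logFile))) {
         lines.forEach(line -> {
            // the first 3 characters are the case id, the rest is the activity name
            String caseId = line.substring(0, 3);
            String activityName = line.substring(3);

            try {
               ctx.collect(BEvent.create("my-process-name", caseId, activityName));
            } catch (EventException e) {
               // report and skip lines that cannot be turned into an event
               System.err.println("Skipping line '" + line + "': " + e.getMessage());
            }
         });
      }
   }
};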

Now a stream of BEvents is available and can be processed with any available miner, for example the Trivial discovery miner (the DirectlyFollowsDependencyDiscoveryMiner class used here):

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
env
   .addSource(customSource)
   .keyBy(BEvent::getProcessName)
   .flatMap(new DirectlyFollowsDependencyDiscoveryMiner().setModelRefreshRate(1).setMinDependency(0.1))
   .addSink(new SinkFunction<ProcessMap>() {
      @Override
      public void invoke(ProcessMap value, Context context) throws Exception {
         value.generateDot().exportToSvg(new File("src/main/resources/output/output.svg"));
      }
   });
env.execute();

In this case, we configured the miner to consume all events. With a model refresh rate of 1, the miner is expected to emit an updated model after every event, so the sink rewrites output.svg each time. Once the stream is completed (here we know the stream will terminate, because the file is finite), output.svg contains the following model:

[Figure: process map with the path start -> ActA -> B -> Act_C -> end. The activities ActA, B and Act_C are each labeled 1.0 (2), and the edges ActA -> B and B -> Act_C are each labeled 1.0 (2).]
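
The counts in parentheses in the model can be checked by hand. The following is a minimal sketch in plain Java that counts how often one activity directly follows another within the same case, using the six example lines. It is not the Beamline miner: DirectlyFollowsCounter is a hypothetical name, and the sketch only reproduces the frequencies (it assumes the "(2)" labels are occurrence counts and does not compute the 1.0 dependency values).

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration, not the Beamline miner: counts directly-follows pairs per case.
final class DirectlyFollowsCounter {

    static Map<String, Integer> count(List<String> lines) {
        Map<String, String> lastActivity = new HashMap<>();   // case id -> last activity seen
        Map<String, Integer> counts = new LinkedHashMap<>();  // "A -> B" -> number of occurrences
        for (String line : lines) {
            String caseId = line.substring(0, 3);
            String activity = line.substring(3);
            // put returns the previous activity of this case, or null for its first event
            String previous = lastActivity.put(caseId, activity);
            if (previous != null) {
                counts.merge(previous + " -> " + activity, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> lines = List.of("002ActA", "001ActA", "002B", "002Act_C", "001B", "001Act_C");
        // prints:
        // ActA -> B: 2
        // B -> Act_C: 2
        count(lines).forEach((relation, n) -> System.out.println(relation + ": " + n));
    }
}
```

Each relation is observed once in case 001 and once in case 002, which matches the value 2 shown on the edges of the model.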

The complete code of this example is available in the GitHub repository https://github.com/beamline/examples/tree/master/src/main/java/beamline/examples/rawData.