## BOSCH IoT INSIGHTS - Processing Pipeline - Embedded GHC

This example uses `patchelf` to avoid `/proc/self/exe` problems.

The image (`FROM docker.io/haskell:slim-buster`) makes the Glasgow Haskell Compiler (GHC)
available in the processing pipeline. The step expects an uploaded input whose payload contains a field
named `expression` and tries to evaluate it.

### Build the ghci_image with its Dockerfile
```sh
docker build -t ghci_image -f ./resources/Dockerfile .
```

### Create a container from the image, which includes building the file system for the image
```sh
CONTAINER_ID=$(docker create ghci_image) && echo $CONTAINER_ID
```

### Export the file system from the container and compress it as an xz file
```sh
docker export $CONTAINER_ID | xz > ./resources/distro_flat.xz
```

### Adapt the Python example to the different environment (ghci compiler)

```sh
docker run --rm -it -v "/$(pwd)/../../scripts:/scripts" ghci_image bash -c "./scripts/create_environment.sh"
```

The `constant.py` for this example must contain the following values:

```python
# Paths for the specific loader library from your own filesystem must be adapted
OWN_ELF_LOADER = "$(pwd)/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2"
# Path to the required and installed fakechroot library must be adapted
OWN_PRELOAD = "$(pwd)/usr/lib/x86_64-linux-gnu/fakechroot/libfakechroot.so"
# Paths to all the necessary used library paths in your own filesystem must be adapted
OWN_LIBRARY_PATHS = "$(pwd)/usr/local/lib:$(pwd)/usr/local/lib/x86_64-linux-gnu:" \
                    "$(pwd)/lib/x86_64-linux-gnu:$(pwd)/usr/lib/x86_64-linux-gnu"

# Path to the extracted filesystem relative to the current working directory after startup
OWN_FILE_SYSTEM_DIR = "./distro"
```
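
How these constants end up on the command line is not shown in this README. The sketch below is one plausible way a helper such as `create_env_variables()` could assemble them, mirroring the `FAKECHROOT_*`, `LD_PRELOAD` and `LD_LIBRARY_PATH` variables used in the Problems section. The function name `build_env_prefix` and its parameters are hypothetical; the real helper in the example sources may differ.

```python
# Hypothetical sketch, not the helper shipped with the example.
OWN_ELF_LOADER = "$(pwd)/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2"
OWN_PRELOAD = "$(pwd)/usr/lib/x86_64-linux-gnu/fakechroot/libfakechroot.so"
OWN_LIBRARY_PATHS = ("$(pwd)/usr/local/lib:$(pwd)/usr/local/lib/x86_64-linux-gnu:"
                     "$(pwd)/lib/x86_64-linux-gnu:$(pwd)/usr/lib/x86_64-linux-gnu")


def build_env_prefix(elf_loader: str, preload: str, library_paths: str) -> str:
    """Return 'NAME=value ...' assignments to prepend to a shell command.

    $(pwd) is left unexpanded on purpose: the shell that runs the command
    expands it.
    """
    assignments = {
        "FAKECHROOT_ELFLOADER": elf_loader,
        "FAKECHROOT_BASE": "$(pwd)",
        "LD_PRELOAD": preload,
        "LD_LIBRARY_PATH": library_paths,
    }
    return " ".join(f"{name}={value}" for name, value in assignments.items())


env_prefix = build_env_prefix(OWN_ELF_LOADER, OWN_PRELOAD, OWN_LIBRARY_PATHS)
print(env_prefix)
```

Keeping the assignments in one dictionary makes it easy to see which constant feeds which variable when a path has to be adapted.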

### Calling the executable from a Python step

The field `payload.expression` is read and passed to `ghc`, which evaluates the expression.
The result is added to the payload as a field named `result`.

```python
async def invoke(self, exchange_data_holder: ExchangeDataHolder) -> ExchangeDataHolder:
    print("\nStart processing " + str(exchange_data_holder))

    for document in exchange_data_holder.documents:
        expression = str(document['payload']['expression'])
        # Build the shell command: environment variables, then ghc -e "<expression>"
        command = (create_env_variables()
                   + ' $(pwd)/opt/ghc/9.4.5/lib/ghc-9.4.5/bin/ghc-9.4.5 -e "' + expression + '"')
        document['payload']['result'] = run(command, constant.OWN_FILE_SYSTEM_DIR)
        # error message and trace should show up in the Insights Console
        if (document.get('metaData') or {}).get('hasError', None):
            # pass
            raise Exception('My Error')
            # also possible is exchange_data_holder.error = "Error"

    print("processed %s\n" % exchange_data_holder.id)

    return exchange_data_holder
```
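
The snippet above places `expression` between double quotes in a shell command, so an expression that itself contains `"`, `$` or backticks would be mangled or interpreted by the shell. The following is a minimal sketch of one way to guard against that with the standard library's `shlex.quote`. The helper name `build_ghc_command` is hypothetical, and `env_prefix` stands in for whatever `create_env_variables()` returns.

```python
import shlex

GHC = "$(pwd)/opt/ghc/9.4.5/lib/ghc-9.4.5/bin/ghc-9.4.5"


def build_ghc_command(env_prefix: str, expression: str) -> str:
    """Build the shell command line that evaluates one Haskell expression.

    Only the user-supplied expression is quoted; $(pwd) in the GHC path and
    in env_prefix must stay unquoted so that the shell still expands it.
    """
    return f"{env_prefix} {GHC} -e {shlex.quote(expression)}"


print(build_ghc_command("FAKECHROOT_BASE=$(pwd)", 'putStrLn "3 + 4"'))
```

The returned string could then be passed to `run(...)` in place of the hand-concatenated command.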

### Package the zip file for this example
Run the following commands to prepare the content of the folder to zip.
```sh
mkdir -p ../../target
cp -r ../../src ../../executable-manifest.yaml ./resources ./src ../../target
```

If you have `zip` installed, you can run the following:
```sh
cd ../../target && zip -r ghci-example.zip * && cd -
```

Otherwise, you can also create your zip archive manually by selecting **the content** of the `target` folder
(do not select the folder itself!) and adding it to a zip archive.
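
If neither `zip` nor a graphical archive tool is at hand, Python's standard `zipfile` module can do the same job. The sketch below is an illustration only; the function name `zip_folder_content` is hypothetical. It stores every file with a name relative to the folder, so the folder itself does not appear in the archive. Unlike `zip -r`, it does not add entries for empty directories.

```python
import zipfile
from pathlib import Path


def zip_folder_content(folder: Path, archive: Path) -> list:
    """Zip every file inside `folder`; entry names are relative to `folder`."""
    names = []
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(folder.rglob("*")):
            # Skip directories and the archive itself if it lives in `folder`.
            if path.is_file() and path.resolve() != archive.resolve():
                name = path.relative_to(folder).as_posix()
                zf.write(path, name)
                names.append(name)
    return names
```

For this example, the call would be `zip_folder_content(Path("../../target"), Path("../../target/ghci-example.zip"))`.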

**Hint**: If you modify the example and upload it again, increase the version in `executable-manifest.yaml`.

## Problems

### Missing or wrong reference to `/proc/self/exe`

For example, if you call an executable like `ghc` in our environment from your own extracted file system, you would 
likely call it like this in a Python `subprocess.run` command:

```sh
FAKECHROOT_ELFLOADER=$(pwd)/lib/x86_64-linux-gnu/ld-2.28.so \
FAKECHROOT_BASE=$(pwd) \
LD_PRELOAD=$(pwd)/usr/lib/x86_64-linux-gnu/fakechroot/libfakechroot.so \
LD_LIBRARY_PATH=$(pwd)/lib/x86_64-linux-gnu:$(pwd)/usr/lib/x86_64-linux-gnu/ \
$(pwd)/lib/x86_64-linux-gnu/ld-2.28.so $(pwd)/opt/ghc/9.4.5/lib/ghc-9.4.5/bin/ghc-9.4.5 -e "3+4"
```

This causes the error `Missing file: /lib/lib/settings` in the App Console, because `ghc` is not started directly.
Instead, the loader is started, so `/proc/self/exe` points to `ld-2.28.so` and not to `ghc`. `ghc` tries to resolve
its own location through the symlink `/proc/self/exe` and ends up in the wrong directory
`$(pwd)/lib/x86_64-linux-gnu/`. It then takes the parent directory `$(pwd)/lib/` and, because `$(pwd)` is the
fakechroot base, resolves it to `/lib/` as the home directory of `ghc`.
Inside this wrong home directory it looks for `lib/settings`, which yields `/lib/lib/settings`, a file that does not exist.
The expected file is located at `$(pwd)/opt/ghc/9.4.5/lib/ghc-9.4.5/lib/settings` instead.
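
The path arithmetic described above can be replayed in a few lines. This is a simplified model of the lookup as this README describes it, not GHC's actual implementation; the function name and the `/work/distro` base directory are made up for illustration.

```python
from pathlib import PurePosixPath


def settings_path_seen_by_ghc(proc_self_exe: str, fakechroot_base: str) -> str:
    """Model the lookup: home = parent of the directory that holds the binary."""
    exe = PurePosixPath(proc_self_exe)
    home = exe.parent.parent
    # Inside the fakechroot, the base directory is seen as "/".
    inside = PurePosixPath("/") / home.relative_to(fakechroot_base)
    return str(inside / "lib" / "settings")


base = "/work/distro"  # hypothetical stand-in for $(pwd)

# Started via the loader: /proc/self/exe names the loader.
print(settings_path_seen_by_ghc(base + "/lib/x86_64-linux-gnu/ld-2.28.so", base))
# -> /lib/lib/settings

# Started directly: /proc/self/exe names ghc itself.
print(settings_path_seen_by_ghc(base + "/opt/ghc/9.4.5/lib/ghc-9.4.5/bin/ghc-9.4.5", base))
# -> /opt/ghc/9.4.5/lib/ghc-9.4.5/lib/settings
```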

To fix this kind of problem, you could add a modified symlink for `$(pwd)/proc/self/exe` in your extracted file system.
You need to execute the following in the Python `main` method of your step.py, so that `$(pwd)` resolves correctly:

```sh
unlink $(pwd)/proc
mkdir -p $(pwd)/proc/self
touch $(pwd)/proc/self/exe
ln -vsf $(pwd)/opt/ghc/9.4.5/lib/ghc-9.4.5/bin/ghc-9.4.5 $(pwd)/proc/self/exe
```

The other option is to use `patchelf`. Your executable is then called directly and not via the loader.
If you call it directly, make sure that the directory `$(pwd)/proc` inside your extracted file system
points to the real system `/proc` directory that is used by the processing pipeline. Normally, our examples create
this link in the Python method `fix_symbolic_links`, which you will find in `embedded_linux.py`.

Otherwise, you will get errors like: `/proc/self/exe: readSymbolicLink: does not exist (No such file or directory)`.
Then you need to check the symlink that points from your file system `$(pwd)/proc` to the outer absolute `/proc/`
directory:

```sh
ln -vsf /proc $(pwd)/proc
```

### Using patchelf instead of running the executable via the loader

As already mentioned, if you have problems with `/proc/self/exe` or with files that are not found at their locations,
you could try to patch the loader in your executable with `patchelf`. Instead of telling the loader to run your
executable, you change the path of the loader (the ELF interpreter) that is recorded inside your executable.

To change the loader inside your executable, run the following command:
```sh
patchelf --set-interpreter $(pwd)/lib/x86_64-linux-gnu/ld-2.28.so $(pwd)/opt/ghc/9.4.5/lib/ghc-9.4.5/bin/ghc-9.4.5
```
A Python example of how to call it from your own step.py is in the file other_examples/ghci/step.py.
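
For orientation, a minimal sketch of such a call with `subprocess` follows. It is not a copy of the referenced step.py: the function names are hypothetical, and it assumes that a `patchelf` binary is callable from the step's environment.

```python
import subprocess
from pathlib import Path


def patchelf_command(base: Path, loader: str, executable: str) -> list:
    """Return the argv for `patchelf --set-interpreter <loader> <executable>`."""
    return ["patchelf", "--set-interpreter", str(base / loader), str(base / executable)]


def patch_interpreter(base: Path, loader: str, executable: str) -> None:
    """Run patchelf; raises CalledProcessError if patchelf reports an error."""
    subprocess.run(patchelf_command(base, loader, executable), check=True)
```

Passing the arguments as a list avoids a second round of shell parsing; `base` plays the role of `$(pwd)` in the command above, for example `patch_interpreter(Path.cwd(), "lib/x86_64-linux-gnu/ld-2.28.so", "opt/ghc/9.4.5/lib/ghc-9.4.5/bin/ghc-9.4.5")`.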

With this change, you can start your executable directly like this:

```sh
FAKECHROOT_ELFLOADER=$(pwd)/lib/x86_64-linux-gnu/ld-2.28.so \
FAKECHROOT_BASE=$(pwd) \
LD_PRELOAD=$(pwd)/usr/lib/x86_64-linux-gnu/fakechroot/libfakechroot.so \
LD_LIBRARY_PATH=$(pwd)/lib/x86_64-linux-gnu:$(pwd)/usr/lib/x86_64-linux-gnu/ \
$(pwd)/opt/ghc/9.4.5/lib/ghc-9.4.5/bin/ghc-9.4.5 -e "4+7"
```

This example calls `ghc` and evaluates `4+7` as a mathematical expression.

### Invalid ELF header

If you get errors like `libc.so is not an ELF file - it has the wrong magic bytes at the start` or
`libc.so: invalid ELF header`, you could try to analyse the file with the invalid ELF header directly inside
your image.

#### Prerequisite:

Your image needs to contain the executable called `file`. You may need to add the following line to your `Dockerfile`.

```Dockerfile
RUN apt-get update && apt-get install -y --no-install-recommends file \
    && apt-get clean && rm -rf /var/lib/apt/lists/* # Clean up to keep the image size as small as possible
```

Analyse it directly in your Docker image (image name here: `ghci_image`):

```sh
docker run --rm -it ghci_image bash -c "file /usr/lib/x86_64-linux-gnu/libc.so"
```
```
# Expected output
/usr/lib/x86_64-linux-gnu/libc.so: ASCII text
```

Analyse it inside Insights' processing pipeline. Because the executable `file` is not installed in our processing
pipeline frame, you need to call the executable `file` the same way as you would try to call your desired executable.
Therefore, you could use the `run_with_own_loader` method and execute `file` with the path to the file with
invalid ELF header:

```python
document['metaData.debug']['file_libc'] = run_with_own_loader('$(pwd)/usr/bin/file /usr/lib/x86_64-linux-gnu/libc.so')
```

**Hint**: Because `run_with_own_loader` already creates a fakechroot for the current directory, we specify the file path
to `libc.so` without the `$(pwd)` as an absolute file path inside the extracted file system.
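
As an alternative that does not depend on `file` being installed, the check can be approximated in pure Python: every ELF file starts with the four bytes `0x7F 'E' 'L' 'F'`. The helper below is a sketch and not part of the example sources; the name `is_elf` is hypothetical. Since it runs in the step's own Python process rather than inside the fakechroot, it presumably needs the real path, that is, the extracted file system directory followed by the path inside it.

```python
from pathlib import Path

ELF_MAGIC = b"\x7fELF"


def is_elf(path: Path) -> bool:
    """Return True if the file starts with the 4-byte ELF magic number."""
    with open(path, "rb") as handle:
        return handle.read(4) == ELF_MAGIC
```

For instance, `is_elf(Path("./distro/usr/lib/x86_64-linux-gnu/libc.so"))` would be expected to return `False` for the ASCII text file shown above.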

In the example other_example/ghci/step.py we do this for `libc` and also for `libm`, because on startup `ghc` tried to
read these ASCII files as ELF files.
If it is not an ELF file, you could try to find a replacement for this library in your image (image name here: `ghci_image`):

```sh
docker run --rm -it ghci_image bash -c "find / -name 'libc*.so*'"
```
```sh
# Expected output
...
/lib/x86_64-linux-gnu/libc-2.28.so
...
/lib/x86_64-linux-gnu/libc.so.6
/usr/lib/x86_64-linux-gnu/libc.so
```

Alternatively, run it inside our processing pipeline frame:

```python
document['metaData.debug']['find_libc'] = run("find / -name 'libc*.so*'", constant.OWN_FILE_SYSTEM_DIR)
```

In our case, the real libc is also reachable through a symbolic link (`/lib/x86_64-linux-gnu/libc.so.6`) that points to
`/lib/x86_64-linux-gnu/libc-2.28.so`. Therefore, we could try to create a new symbolic link from
`/usr/lib/x86_64-linux-gnu/libc.so` -> `/lib/x86_64-linux-gnu/libc-2.28.so` or `/lib/x86_64-linux-gnu/libc.so.6`.

```sh
ln -svf $(pwd)/lib/x86_64-linux-gnu/libc-2.28.so $(pwd)/usr/lib/x86_64-linux-gnu/libc.so
```

This needs to be executed in the Python step inside the `main` method. The `invalid ELF header` error
should then disappear. You may have to do the same for other libraries.
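
A Python counterpart of `ln -sf` for the `main` method could look like the sketch below. It is an illustration only; the name `force_symlink` is hypothetical. It creates the new link under a temporary name and renames it over the old entry, so the library path never points at nothing in between.

```python
import os
from pathlib import Path


def force_symlink(target: Path, link: Path) -> None:
    """Point `link` at `target`, replacing an existing file or symlink."""
    temporary = link.with_name(link.name + ".tmp")
    if temporary.is_symlink() or temporary.exists():
        temporary.unlink()  # clear leftovers from an interrupted run
    os.symlink(target, temporary)
    os.replace(temporary, link)  # rename over the old entry
```

For the case above, the call would be `force_symlink(base / "lib/x86_64-linux-gnu/libc-2.28.so", base / "usr/lib/x86_64-linux-gnu/libc.so")`, with `base` being the directory of the extracted file system.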
