Tamizh in words

Implementing API Gateway Pattern in fsharp using Hopac

Read time: 18 min

Two years ago, I wrote a blog post on implementing the API Gateway pattern using Reactive Extensions (Rx). In this third part of the concurrent programming in fsharp using Hopac blog series, we are going to revisit that post and port it to Hopac.

Rx vs Hopac

The critical difference between Rx and Hopac is their communication model.

Rx is primarily a system for querying data in motion asynchronously, and systems built using Rx rely on asynchronous message passing. Hopac's programming model, by contrast, uses synchronous message passing over channels: the sender and the receiver meet at a rendezvous point.
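
As a rough illustration of the rendezvous idea (a sketch, not code from the original post), the snippet below starts a giver on a channel and then takes from it. The channel name ch and the message are made up for the example, and it assumes the Hopac assemblies are already referenced as in the Getting Started section below.

```fsharp
// Assumes the Hopac #r references from "Getting Started" are already loaded.
open Hopac
open Hopac.Infixes

// A synchronous channel: a give completes only when a take meets it.
let ch = Ch<string> ()

let received =
  Job.start (Ch.give ch "ping") // run the giver as a separate lightweight job
  >>=. Ch.take ch               // then take; giver and taker rendezvous here
  |> run

printfn "%s" received
```

Neither side buffers the message: the giver stays suspended until the taker arrives, which is the main contrast with the push-based model of Rx.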

If you are interested in knowing more about the difference, check out Rich Hickey's talk introducing Clojure core.async.

The Problem Statement

Let's get started by spending some time on understanding the problem that we are going to solve.

Our objective is to write a backend that serves the data for a GitHub user profile view.

This profile view has three components.

User - Username and Avatar.

Popular Repos - Top three public repos of the given user (determined by the number of stars).

Languages - Programming languages used in the corresponding repos.

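To get this data from the GitHub APIs, we need to make five API calls: one for the user, one for the user's repos, and one per top repo (three in total) for its languages.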

We can use GitHub's GraphQL API to minimise it. As the focus of this blog post is different, we are not going to consider this.

Getting Started

Let's create an F# script file ApiGateway.fsx and add the following NuGet packages using Paket, as we did in the earlier parts.

> touch ApiGateway.fsx
> forge paket init
> forge paket add Hopac
> forge paket add Http.fs # <1>
> forge paket add System.Net.Http # <2>
> forge paket add FSharp.Data # <3>

<1> Http.Fs - A simple, functional HTTP client library for F# using Hopac

<2> System.Net.Http - A dependency of Http.Fs

<3> FSharp.Data - For using JSON Type Provider and JSON serialization.

Then reference these packages in the script file

#r "packages/Hopac/lib/net45/Hopac.Core.dll"
#r "packages/Hopac/lib/net45/Hopac.Platform.dll"
#r "packages/Hopac/lib/net45/Hopac.dll"
#r "packages/FSharp.Data/lib/net45/FSharp.Data.dll"
#r "packages/Http.fs/lib/net461/HttpFs.dll"
#r "packages/System.Net.Http/lib/net46/System.Net.Http.dll"

open Hopac
open FSharp.Data
open HttpFs.Client

Setting up GitHub Http GET request

According to the GitHub API version 3 specification, a User-Agent header is mandatory for all API requests.

As we are going to use only HTTP GET requests, let's create a function httpGet that takes care of passing the User-Agent header in all of them.

// string -> Job<string>
let httpGet url =
  Request.createUrl Get url // Request
  |> Request.setHeader (UserAgent "FsHopac") // Request
  |> getResponse // Alt<Response> // <1>
  |> Job.bind Response.readBodyAsString // Job<string> // <2>

<1> getResponse is a function from Http.Fs library that fires the HTTP request and returns the Response as Alt, a subclass of Hopac's Job.

<2> As the name indicates, the readBodyAsString function (Response -> Job<string>) reads the response body and returns its string representation.

<2> The bind function creates a job that first runs the given job (Job<Response>) and then passes the result of that job (Response) to the given function (Response -> Job<string>) to build another job which will then be run.

val bind: ('x -> Job<'y>) -> Job<'x> -> Job<'y>

Getting User

To parse the user information from GitHub, let's create the JSON Type Provider for the User.

type UserTypeProvider = JsonProvider<"https://api.github.com/users/tamizhvendan">
type User = UserTypeProvider.Root

Then, we can make use of the httpGet function that we defined earlier to get the User JSON response from GitHub and parse the response using the UserTypeProvider.

let basePath = "https://api.github.com"

// string -> string
let userUrl = sprintf "%s/users/%s" basePath

// string -> Job<User>
let getUser username : Job<User> =
  userUrl username
  |> httpGet // Job<string>
  |> Job.map UserTypeProvider.Parse // Job<User> // <1>

<1> The Job.map function creates a job that runs the given job (Job<string>) and maps the result of the job (string) with the given function (string -> User).

val map: ('x -> 'y) -> Job<'x> -> Job<'y>
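
To make the difference between Job.map and Job.bind concrete, here is a small sketch that uses constant jobs instead of HTTP calls. The names lengthJob and shoutJob are made up for the illustration, and the Hopac references from the script above are assumed to be loaded.

```fsharp
// Assumes the Hopac #r references from "Getting Started" are already loaded.
open Hopac

// Job.map: the mapping function returns a plain value (string -> int).
let lengthJob : Job<int> =
  Job.result "hopac"        // Job<string>
  |> Job.map String.length  // Job<int>

// Job.bind: the function itself returns a job (string -> Job<string>).
let shoutJob : Job<string> =
  Job.result "hopac"                              // Job<string>
  |> Job.bind (fun s -> Job.result (s.ToUpper())) // Job<string>

printfn "%d" (run lengthJob)
printfn "%s" (run shoutJob)
```

In short, reach for Job.map when the next step is an ordinary function and for Job.bind when the next step is itself a job, as readBodyAsString is in httpGet.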

We can verify the getUser function using the F# interactive

> getUser "tamizhvendan" |> run;;
val it : User =
{
"login": "tamizhvendan",
"id": 1128916,
"avatar_url": "https://avatars0.githubusercontent.com/u/1128916?v=4",
...
}

Getting Top Three User Repos

Our next task is getting the top three repos of the user based on the count of stars. As we did for User, let's start by defining the Repo type using the JSON Type Provider.

type ReposTypeProvider =
  JsonProvider<"https://api.github.com/users/tamizhvendan/repos">
type Repo = ReposTypeProvider.Root

Then, we have to add a function that returns the top three repos from the given repos.

// Repo [] -> Repo []
let topThreeUserRepos (repos : Repo []) =
  repos
  |> Array.filter (fun repo -> not repo.Fork) // Repo shouldn't be a fork
  |> Array.sortByDescending (fun repo -> repo.StargazersCount)
  // Array.truncate returns at most three items and, unlike Array.take,
  // doesn't throw when fewer than three repos remain after filtering out forks
  |> Array.truncate 3
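
The same filter, sort and take pipeline can be tried without calling GitHub. The sketch below uses a hypothetical SampleRepo record as a stand-in for the provided Repo type, and it uses Array.truncate, which returns at most the requested number of items and so tolerates inputs with fewer than three non-fork repos.

```fsharp
// Hypothetical stand-in for the type-provider generated Repo type.
type SampleRepo = { Name : string; Fork : bool; StargazersCount : int }

// SampleRepo [] -> SampleRepo []
let topThree (repos : SampleRepo []) =
  repos
  |> Array.filter (fun r -> not r.Fork)                  // drop forks
  |> Array.sortByDescending (fun r -> r.StargazersCount) // most starred first
  |> Array.truncate 3                                    // at most three items

let sample =
  [| { Name = "a"; Fork = false; StargazersCount = 5 }
     { Name = "b"; Fork = true;  StargazersCount = 50 }
     { Name = "c"; Fork = false; StargazersCount = 9 } |]

let names = topThree sample |> Array.map (fun r -> r.Name)
printfn "%A" names // [|"c"; "a"|]
```

Only two repos survive the fork filter here, which is exactly the case where taking a fixed count of three would fail.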

The final step is wiring up the getTopThreeUserRepos function with the help of the functions that we have defined so far.

let userReposUrl = sprintf "%s/users/%s/repos?per_page=100" basePath

// string -> Job<Repo []>
let getTopThreeUserRepos username : Job<Repo []> =
  userReposUrl username
  |> httpGet // Job<string>
  |> Job.map ReposTypeProvider.Parse // Job<Repo []>
  |> Job.map topThreeUserRepos // Job<Repo []>

We can verify this in F# interactive as well

> getTopThreeUserRepos "tamizhvendan" |> run;;
val it : Repo [] =
[|{
"id": 12037577,
"name": "blog-samples",
"full_name": "tamizhvendan/blog-samples"
...]

The list user repositories GitHub API that we are using here returns a maximum of 100 repositories per page, and we need to use their pagination logic to fetch all the repositories.

We'll go ahead with this basic implementation for now and address pagination later in this blog post.

Getting Repo Languages

The GitHub API for returning a repo's languages has the following JSON structure

{
"F#": 65237,
"JavaScript": 25034,
"Shell": 876,
"HTML": 391,
"Batchfile": 214
}

From the response, we can see that the languages API response doesn't have a fixed schema. So, to get the languages, we need to parse the JSON and pick only its keys.

We can make use of the JsonValue.Parse function from FSharp.Data library to do it.

// string -> string []
let parseLanguagesJson languagesJson =
  languagesJson
  |> JsonValue.Parse // JsonValue
  |> JsonExtensions.Properties // (string * JsonValue) []
  |> Array.map fst // string []

Then we can leverage this function to get the languages of a user repo.

let languagesUrl userName repoName =
  sprintf "%s/repos/%s/%s/languages" basePath userName repoName

// string -> string -> Job<string []>
let getUserRepoLanguages username repoName =
  languagesUrl username repoName
  |> httpGet // Job<string>
  |> Job.map parseLanguagesJson // Job<string []>

Running this function in F# interactive will give us the following output

getUserRepoLanguages "tamizhvendan" "CafeApp" |> run;;
val it : string [] = [|"F#"; "JavaScript"; "Shell"; "HTML"; "Batchfile"|]

The Climax

Alright, now we have three individual functions that take care of fetching the different components.

val getUser : string -> Job<User>
val getTopThreeUserRepos : string -> Job<Repo []>
val getUserRepoLanguages : string -> string -> Job <string []>

The final piece is putting all these functions together and preparing the expected response. Before getting into that, let's define some types to represent the final response that we want and a helper function to construct it.

type RepoDto = {
  Name : string
  StargazersCount : int
  Languages : string []
}

// Repo -> string [] -> RepoDto
let repoDto (repo : Repo) languages = {
  Name = repo.Name
  StargazersCount = repo.StargazersCount
  Languages = languages
}

type UserDto = {
  Name : string
  AvatarUrl : string
  TopThreeRepos : RepoDto []
}

To construct this response, we can call these functions sequentially inside a job computation expression and populate the UserDto from the return values.

But do we need to run them sequentially in the first place? We can run them in parallel as well.

We can run the jobs Job<User> and Job<Repo []> in parallel. Then, for each Repo in the Repo [], we can get its languages job (Job<string []>) and run those in parallel too. Finally, we can combine all the results and create the UserDto.

Let's see it in action

open Hopac.Infixes

// string -> Repo -> Job<RepoDto>
let getRepoDto username (repo : Repo) =
  getUserRepoLanguages username repo.Name
  |> Job.map (repoDto repo)

// string -> Job<UserDto>
let getUserDto username = job {
  let! user, repos =
    getUser username <*> getTopThreeUserRepos username // <1>
  let! repoDtos =
    repos
    |> Array.map (getRepoDto username) // <2>
    |> Job.conCollect // <3>
  return {
    Name = user.Name
    AvatarUrl = user.AvatarUrl
    TopThreeRepos = repoDtos.ToArray()
  }
}

<1> Uses the infix function <*> that we saw in part 1 to run the given two jobs in parallel and return their results as a tuple.

<2> For each Repo in the repos array, we are getting their respective Job<RepoDto> using the getRepoDto function.

<3> Job.conCollect creates a job that runs all of the jobs as separate concurrent jobs and returns their results as ResizeArray.

val conCollect: seq<Job<'x>> -> Job<ResizeArray<'x>>
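
The two combinators can also be tried in isolation. In the sketch below, slow is a hypothetical helper that simulates a slow call with a timeout instead of hitting GitHub, and the Hopac references from the script above are assumed to be loaded.

```fsharp
// Assumes the Hopac #r references from "Getting Started" are already loaded.
open Hopac
open Hopac.Infixes

// Hypothetical helper: waits for the given number of milliseconds
// and then yields x. It stands in for an HTTP call.
let slow (ms : int) (x : 'a) : Job<'a> =
  timeOutMillis ms >>=. Job.result x

// <*> combines two jobs into one job of a tuple.
let user, repos =
  slow 100 "user" <*> slow 100 [| "repo-1"; "repo-2" |]
  |> run

// Job.conCollect starts every job in the sequence as a separate
// concurrent job and gathers the results into a ResizeArray.
let squares =
  [ 1; 2; 3 ]
  |> List.map (fun n -> slow 100 (n * n))
  |> Job.conCollect
  |> run

printfn "%s %A %A" user repos (List.ofSeq squares)
```

Swapping slow for real calls such as getUser and getUserRepoLanguages gives the shape of getUserDto above.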

That's all.

Let's run this function in F# interactive and verify our implementation

getUserDto "tamizhvendan" |> run;;
val it : UserDto =
{Name = "Tamizhvendan S";
AvatarUrl = "https://avatars0.githubusercontent.com/u/1128916?v=4";
TopThreeRepos =
[|{Name = "blog-samples";
StargazersCount = 89;
Languages = [|"F#"; "HTML"; "Liquid"; "JavaScript"|];};
{Name = "CafeApp";
StargazersCount = 66;
Languages = [|"F#"; "JavaScript"; "Shell"; "HTML"; "Batchfile"|];};
{Name = "fsharp-phonecat";
StargazersCount = 12;
Languages = [|"JavaScript"; "F#"; "C#"; "CSS"; "Shell"; "ASP"|];}|];}

Hurrah, everything is working as expected!

Adding Pagination Support

As mentioned earlier, in this sub-section we are going to add the support for pagination in the get all public repositories API.

The pagination logic in GitHub APIs uses the Link header in the HTTP response to communicate the navigation URLs.

A sample value of the Link header would look like

  <https://api.github.com/user/193115/repos?per_page=100&page=1>; rel="prev",
  <https://api.github.com/user/193115/repos?per_page=100&page=3>; rel="next",
  <https://api.github.com/user/193115/repos?per_page=100&page=3>; rel="last",
  <https://api.github.com/user/193115/repos?per_page=100&page=1>; rel="first"

To support pagination in our implementation, let's create a new type for representing GitHub's response.

type GitHubResponse = {
  Body : string
  NextPageUrl : string option
}

Then we need to write a function to extract the next pagination URL from the value of the Link header

// string -> string option
let getNextPageUrl (linkText : string) =
  linkText.Split([|','|])
  |> Array.map (fun l -> l.Split([|';'|]))
  |> Array.tryFind (fun l -> l.Length = 2 && l.[1].Contains("next"))
  |> Option.map (fun l -> l.[0].Trim().TrimStart('<').TrimEnd('>'))
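
To check the parsing logic without making a request, the function can be applied to the sample Link header shown earlier. The snippet repeats getNextPageUrl so that it runs on its own; sampleLink is just the sample header joined into one string.

```fsharp
// string -> string option (repeated from above so the snippet is standalone)
let getNextPageUrl (linkText : string) =
  linkText.Split([|','|])
  |> Array.map (fun l -> l.Split([|';'|]))
  |> Array.tryFind (fun l -> l.Length = 2 && l.[1].Contains("next"))
  |> Option.map (fun l -> l.[0].Trim().TrimStart('<').TrimEnd('>'))

// The sample Link header value from this post, as a single string.
let sampleLink =
  "<https://api.github.com/user/193115/repos?per_page=100&page=1>; rel=\"prev\", " +
  "<https://api.github.com/user/193115/repos?per_page=100&page=3>; rel=\"next\", " +
  "<https://api.github.com/user/193115/repos?per_page=100&page=3>; rel=\"last\", " +
  "<https://api.github.com/user/193115/repos?per_page=100&page=1>; rel=\"first\""

let next = getNextPageUrl sampleLink
let none = getNextPageUrl ""

printfn "%A" next // Some "https://api.github.com/user/193115/repos?per_page=100&page=3"
printfn "%A" none // None
```

A header without a rel="next" entry, such as the one on the last page, yields None, which is what ends the page navigation later on.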

The next step is to populate the GitHubResponse from the HTTP response.

// Response -> Job<GitHubResponse>
let gitHubResponse response = job {
  let! body = Response.readBodyAsString response
  let nextPageUrl =
    match response.headers.TryFind(ResponseHeader.Link) with
    | Some linkText -> getNextPageUrl linkText
    | None -> None
  return {Body = body; NextPageUrl = nextPageUrl}
}

The current implementation of httpGet function just returns the body of the response. But to support pagination, we need a value of GitHubResponse. So, let's create a new function httpGetWithPagination to address this need.

// string -> Job<GitHubResponse>
let httpGetWithPagination url =
  Request.createUrl Get url // Request
  |> Request.setHeader (UserAgent "FsHopac") // Request
  |> getResponse // Alt<Response>
  |> Job.bind gitHubResponse // Job<GitHubResponse>

We can modify the httpGet function to make use of this function

 let httpGet url =
-  Request.createUrl Get url
-  |> Request.setHeader (UserAgent "FsHopac")
-  |> getResponse
-  |> Job.bind Response.readBodyAsString
+  httpGetWithPagination url
+  |> Job.map (fun r -> r.Body)

Now we have the HTTP support in place for pagination. The next step is navigating through the pagination links and getting all the user repos.

// string -> Job<Repo []>
let getAllUserRepos username =

  // string -> Repo list -> Job<Repo list>
  let rec getAllUserRepos' url (repos : Repo list) = job { // <1>
    let! gitHubResponse = httpGetWithPagination url // GitHubResponse
    let currentPageRepos =
      gitHubResponse.Body // string
      |> ReposTypeProvider.Parse // Repo []
      |> Array.toList // Repo list
    let reposSoFar = repos @ currentPageRepos // <2>
    match gitHubResponse.NextPageUrl with
    | Some nextPageUrl -> // <3>
      return! getAllUserRepos' nextPageUrl reposSoFar // <4>
    | None -> return reposSoFar // <5>
  }

  getAllUserRepos' (userReposUrl username) [] // Job<Repo list> // <6>
  |> Job.map (List.toArray) // Job<Repo []>

<1> - The recursive function getAllUserRepos' implements the page navigation logic and the outer function getAllUserRepos calls this recursive one (at // <6>) with the initial URL and an empty list of Repo.

<2> - After getting the Repo list of the current page, the getAllUserRepos' function appends it to the repos accumulated from the previous pages.

<3> - If the current page has the next page URL, we are calling the getAllUserRepos' function recursively with the next page's URL and repo lists populated so far (at // <4>).

<5> - If we reached the last page, we just return the repo lists populated so far
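
The same accumulate-and-recurse shape can be exercised offline. In the sketch below, pages and fetchPage are hypothetical stand-ins for GitHub and httpGetWithPagination: each fake page carries its items together with an optional next URL. The Hopac references from the script above are assumed to be loaded.

```fsharp
// Assumes the Hopac #r references from "Getting Started" are already loaded.
open Hopac

// Hypothetical in-memory pages: url -> (items on the page, next page url)
let pages =
  Map.ofList [
    "page1", ([ "a"; "b" ], Some "page2")
    "page2", ([ "c" ], Some "page3")
    "page3", ([ "d" ], None)
  ]

// Stand-in for httpGetWithPagination: looks the page up instead of doing HTTP.
let fetchPage url = Job.result (Map.find url pages)

// string -> string list -> Job<string list>
let rec collect url acc = job {
  let! items, next = fetchPage url
  let soFar = acc @ items
  match next with
  | Some nextUrl -> return! collect nextUrl soFar // more pages to visit
  | None -> return soFar                          // last page reached
}

let allItems = collect "page1" [] |> run
printfn "%A" allItems // ["a"; "b"; "c"; "d"]
```

Each page is requested only after the previous one has been read, because the next URL is not known before that.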

The final change is to call this getAllUserRepos function from the getTopThreeUserRepos function

 let getTopThreeUserRepos username : Job<Repo []> =
-  userReposUrl username
-  |> httpGet
-  |> Job.map ReposTypeProvider.Parse
+  getAllUserRepos username
   |> Job.map topThreeUserRepos

Adding Log

If we run the current implementation in F# interactive, we just see the final output, and we have no clue about what is happening behind the scenes.

To get this insight let's sprinkle some log statements.

let log msg x =
  printfn "%s" msg
  x

 let httpGetWithPagination url =
   Request.createUrl Get url
   |> Request.setHeader (UserAgent "FsHopac")
+  |> log ("Request : " + url)
   |> getResponse
   |> Job.bind gitHubResponse
+  |> Job.map (log ("Response : " + url))

If we run the getUserDto function with the username (haf) of Henrik Feldt, co-author of Suave, who has 280 public repositories,

#time "on"
getUserDto "haf" |> run
#time "off"

We'll get the following output

--> Timing now on

Request : https://api.github.com/users/haf
Request : https://api.github.com/users/haf/repos?per_page=100
Response : https://api.github.com/users/haf
Response : https://api.github.com/users/haf/repos?per_page=100
Request : https://api.github.com/user/193115/repos?per_page=100&page=2
Response : https://api.github.com/user/193115/repos?per_page=100&page=2
Request : https://api.github.com/user/193115/repos?per_page=100&page=3
Response : https://api.github.com/user/193115/repos?per_page=100&page=3
Request : https://api.github.com/repos/haf/expecto/languages
Request : https://api.github.com/repos/haf/Http.fs/languages
Request : https://api.github.com/repos/haf/DotNetZip.Semverd/languages
Response : https://api.github.com/repos/haf/expecto/languages
Response : https://api.github.com/repos/haf/DotNetZip.Semverd/languages
Response : https://api.github.com/repos/haf/Http.fs/languages
Real: 00:00:04.179, CPU: 00:00:01.078, GC gen0: 3, gen1: 1
val it : UserDto =
{Name = "Henrik Feldt";
AvatarUrl = "https://avatars0.githubusercontent.com/u/193115?v=4";
TopThreeRepos =
[|{Name = "expecto";
StargazersCount = 220;
Languages = [|"F#"; "C#"; "Shell"; "Batchfile"|];};
{Name = "Http.fs";
StargazersCount = 197;
Languages = [|"HTML"; "F#"; "Ruby"; "Batchfile"; "Shell"|];};
{Name = "DotNetZip.Semverd";
StargazersCount = 178;
Languages =
[|"C#"; "HTML"; "Visual Basic"; "ASP"; "Smalltalk"; "JavaScript";
"Batchfile"; "Ruby"; "PowerShell"; "C++"; "Makefile"; "PHP"; "Perl"|];}|];}


--> Timing now off

From this log, we can infer that the requests to get the user and the first page of repos run in parallel, the requests for navigating the remaining pages run sequentially, and finally, the requests for obtaining the languages of the top three repos run in parallel.

Exposing HTTP endpoint

The final piece of work that we need to do is exposing the getUserDto function via HTTP GET API.

Add the Suave NuGet package and reference it in the script file

forge paket add Suave
// ...
#r "packages/Suave/lib/net40/Suave.dll"

// ...
open Suave
open Suave.Successful
open System.Threading

// ...

To serialise UserDto and RepoDto to JSON, let's add a ToJson function which transforms the DTOs to FSharp.Data's JsonValue type.

type RepoDto = {
  // ...
} with
  static member ToJson(r : RepoDto) =
    let languages =
      r.Languages
      |> Array.map (JsonValue.String)
      |> JsonValue.Array
    let stars =
      r.StargazersCount |> decimal |> JsonValue.Number
    JsonValue.Record [|
      "name", JsonValue.String r.Name
      "stars", stars
      "languages", languages
    |]

type UserDto = {
  // ...
} with
  static member ToJson(u : UserDto) =
    let topThreeRepos =
      u.TopThreeRepos
      |> Array.map RepoDto.ToJson
      |> JsonValue.Array
    JsonValue.Record [|
      "name", JsonValue.String u.Name
      "avatarUrl", JsonValue.String u.AvatarUrl
      "topThreeRepos", topThreeRepos
    |]
  static member ToJsonString(u : UserDto) =
    UserDto.ToJson(u).ToString()

The ToJsonString function returns the string representation of the UserDto's JsonValue.

Then write the getUserApi function that calls the getUserDto function and transforms the return value to the corresponding HTTP response.

open Suave.Operators // <1>

// string -> HttpContext -> Async<HttpContext option>
let getUserApi username ctx = async {
  let! userDtoResponse =
    getUserDto username // Job<UserDto>
    |> Job.catch // Job<Choice<UserDto, exn>> // <2>
    |> Job.toAsync // Async<Choice<UserDto, exn>> // <3>
  match userDtoResponse with
  | Choice1Of2 userDto ->
    let res =
      userDto
      |> UserDto.ToJsonString
      |> OK
      >=> Writers.setMimeType "application/json; charset=utf-8"
    return! res ctx
  | Choice2Of2 ex ->
    printfn "%A" ex
    return! ServerErrors.INTERNAL_ERROR "something went wrong" ctx
}

<1> Opens the module Suave.Operators here instead of at the beginning of the file as >=> symbol is also defined in Hopac.Infixes module

<2> The Job.catch function creates a job that runs the given job and results in either the result of the job or the exception raised by the job.

<3> The Job.toAsync function creates an async operation that starts the given job and waits for it to complete
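
The error path is easy to try without a web server. In this sketch, failing is a hypothetical job that always throws; it is passed through the same Job.catch and Job.toAsync steps as above. The Hopac references from the script above are assumed to be loaded.

```fsharp
// Assumes the Hopac #r references from "Getting Started" are already loaded.
open Hopac

// Hypothetical job that raises an exception when it is run.
let failing : Job<int> = Job.delay (fun () -> failwith "boom")

let outcome =
  failing
  |> Job.catch              // Job<Choice<int, exn>>
  |> Job.toAsync            // Async<Choice<int, exn>>
  |> Async.RunSynchronously // Choice<int, exn>

match outcome with
| Choice1Of2 value -> printfn "ok: %d" value
| Choice2Of2 ex -> printfn "failed: %s" ex.Message
```

Because the exception is turned into a Choice value inside the job, the async block in getUserApi can pattern match on it and reply with an error response instead of crashing.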

Finally, wire up the getUserApi function with a path and start the Suave server.

let app = pathScan "/api/profile/%s" getUserApi

let startServer () =
  let cts = new CancellationTokenSource()
  let listening, server =
    startWebServerAsync defaultConfig app
  Async.Start(server, cts.Token) |> ignore
  Async.RunSynchronously listening |> ignore
  cts

let stopServer (cts : CancellationTokenSource) =
  cts.Cancel true
  cts.Dispose()

> let cts = startServer ();;
[09:45:15 INF] Smooth! Suave listener started in 1.712 with binding 127.0.0.1:8080
val cts : CancellationTokenSource

We can then verify the API via Curl.

Awesome!!

Make sure that we stop the server using the stopServer function (as we are using F# Script to run the HTTP server).

> stopServer cts;;
val it : unit = ()

Summary

Parallelism is merely running things in parallel. Concurrency is a way to structure your program. - Rob Pike

It is precisely what we did in this blog post. We structured the execution of Jobs declaratively and got the job (pun intended) done with the help of Hopac.

Along the way, we have also learned the Job.bind, Job.map, Job.catch, Job.toAsync and Job.conCollect functions from the Hopac library. The source code of this blog post is available on GitHub.

