问题
I have the following code which grabs two pages of data from a paginated API endpoint. I'd like to modify query
function to keep getting pages until it finds no more data (so replace take 2
in the code below with something which looks at the API response).
My question is wether it is possible to achieve this without changing query
function to an IO
function. And if so, how would I go about it. If not, is there a way of doing this without writing recursive function?
Here is the code:
#!/usr/bin/env stack
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE TypeOperators #-}
import Servant.Client
import Network.HTTP.Client (newManager, defaultManagerSettings)
import Data.Proxy
import Servant.API
import Data.Aeson
import GHC.Generics
-- data type
data BlogPost = BlogPost
{ id :: Integer
, title :: String
} deriving (Show, Generic)
instance FromJSON BlogPost
-- api client
type API = "posts" :> QueryParam "_page" Integer :> Get '[JSON] [BlogPost]
api :: Proxy API
api = Proxy
posts :: Maybe Integer -> ClientM [BlogPost]
posts = client api
-- query by page
query :: ClientM [[BlogPost]]
query = sequence $ take 2 $ map posts pages
where
pages = [Just p | p <- [1..]]
-- main
main :: IO ()
main = do
manager' <- newManager defaultManagerSettings
let url = ClientEnv manager' (BaseUrl Http "jsonplaceholder.typicode.com" 80 "")
posts' <- runClientM query url
print posts'
I've tried to use takeWhileM
to do this and ended up making query an IO
function and passing url
into it. It was starting to look pretty horrible and I couldn't get the types to match up (I felt like I needed something more like (a -> m Bool) -> m [a] -> m [a]
rather than (a -> m Bool) -> [a] -> m [a]
which is what takeWhileM
is - still find this strange because I see this function as a filter, yet the input list and output list are different (one has monad around it and the other doesn't)).
回答1:
For these cases of monadic iteration I usually turn to the streaming library. Its interface is reminiscent to that of pure lists, while still allowing effects:
import Streaming
import qualified Streaming.Prelude as S
repeatAndCollect :: Monad m => m (Either a r) -> m [a]
repeatAndCollect = S.toList_ . Control.Monad.void . S.untilRight
repeatAndCollectLimited :: Monad m => Int -> m (Either a r) -> m [a]
repeatAndCollectLimited len = S.toList_ . S.take len . S.untilRight
Using the untilRight, take and toList_ functions.
When only the first successful result is needed, we can use the Alternative
instance of the ExceptT transformer in combination with asum from Data.Foldable
to execute a list of fallible actions until one of them succeeds.
IO
itself has an Alternative
instance that returns the first "success", where "failure" means throwing a IOException
.
回答2:
Have you tried unfoldM
?
unfoldM :: Monad m => m (Maybe a) -> m [a]
Let's update posts
this way
posts :: Maybe Integer -> ClientM (Maybe [BlogPost])
posts = fmap notNil . client api where
notNil [] = Nothing
notNil bs = Just bs
The idea is to update query
so that you can just use unfoldM query
and get back an ClientM [[BlogPost]]
. To do that, the type of query
has to be
query :: ClientM (Maybe [BlogPost])
meaning, the page number must be coming from the environment:
query = forever $ page >>= posts
Clearly, there is some form of state going on here, as we need a way to keep track of the current page number. We can wrap the client action in a StateT
:
type ClientSM = StateT Integer ClientM
page :: ClientSM Integer
page = get <* modify (+1)
This action demands a few additional changes to both query
and posts
. Edit: see below for a stroke of insight I got in the bus. First we need to lift the client action in the state monad:
posts :: Integer -> ClientSM (Maybe [BlogPost])
posts = fmap notNil . lift . client api . Just where
notNil [] = Nothing
notNil xs = Just xs
Only the type of query
needs changing
query :: ClientSM (Maybe [BlogPost])
Finally, the main action just needs to peel the monad stack and unfold the query:
main = do
manager' <- newManager defaultManagerSettings
let url = mkClientEnv manager' (BaseUrl Http "jsonplaceholder.typicode.com" 80 "")
result <- flip runClientM url $ flip runStateT 1 $ unfoldM query
case result of
Left error -> print error
Right (posts, _) -> print posts
I haven't tested this, but it compiles 😅🤗
posts
is oblivious to the state, and should remain so. So, without changing my original version above, you just need to lift in query
:
query :: ClientSM (Maybe [BlogPost])
query = forever $ page >>= lift . posts . Just
回答3:
If you need to keep the ClientM
objects separate (either to run them each in a clean state, or anything similar), the best way is to chain your operations together.
In this particular case, the runClientM query ...
IO action returns a Either String [BlogPost]
. This means that the stop condition is receiving a Left String
from one of the computations.
Using a hand-crafted eitherM
helper, which runs one of two actions depending on the Either
contructor, here is a relatively simple example of that:
Using the good old either makes this relatively simple :
queryAll :: ClientEnv -> [Int] -> IO [[BlogPost]]
queryAll _ [] = return []
queryAll url (x:xs) = runClientM (posts x) url >>= either ((const.pure) []) (\b -> (b:) <$> queryAll url xs)
main :: IO ()
main = do
manager' <- newManager defaultManagerSettings
let url = ClientEnv manager' (BaseUrl Http "jsonplaceholder.typicode.com" 80 "")
posts' <- queryAll url [1..]
print posts'
Hope it can help! :)
来源:https://stackoverflow.com/questions/50178815/can-io-actions-be-sequenced-while-keeping-the-logic-in-a-pure-function