Merge branch 'cjp/contributors-guide-and' of https://github.com/mendableai/firecrawl into cjp/contributors-guide-and

Nicolas 2024-04-21 11:27:37 -07:00
commit 749bd5f44d
2 changed files with 113 additions and 19 deletions

@@ -1,8 +1,96 @@
-# Contributing
-We love contributions! Our contribution guide will be coming soon!
-<!-- Please read our [contributing guide](CONTRIBUTING.md) before submitting a pull request. -->
# Contributors guide:
Welcome to firecrawl 🔥! Here are some instructions on how to get the project running locally, so you can run it on your own (and contribute).
If you're contributing, note that the process is similar to other open source repos (i.e. fork firecrawl, make changes, run tests, open a PR). If you have any questions or would like help getting on board, reach out to hello@mendable.ai or submit an issue!
## Hosting locally
First, start by installing dependencies:
1. node.js [instructions](https://nodejs.org/en/learn/getting-started/how-to-install-nodejs)
2. pnpm [instructions](https://pnpm.io/installation)
3. redis [instructions](https://redis.io/docs/latest/operate/oss_and_stack/install/install-redis/)
Set environment variables in a .env file in the /apps/api/ directory. You can copy over the template in .env.example.
To start, we won't set up authentication or any optional sub services (pdf parsing, JS blocking support, AI features).
```.env
# ===== Required ENVS ======
NUM_WORKERS_PER_QUEUE=8
PORT=3002
HOST=0.0.0.0
REDIS_URL=redis://localhost:6379
## To turn on DB authentication, you need to set up supabase.
USE_DB_AUTHENTICATION=false
# ===== Optional ENVS ======
# Supabase Setup (used to support DB authentication, advanced logging, etc.)
SUPABASE_ANON_TOKEN=
SUPABASE_URL=
SUPABASE_SERVICE_TOKEN=
# Other Optionals
TEST_API_KEY= # use if you've set up authentication and want to test with a real API key
SCRAPING_BEE_API_KEY= # Set if you'd like to use ScrapingBee to handle JS blocking
OPENAI_API_KEY= # add for LLM dependent features (image alt generation, etc.)
BULL_AUTH_KEY= #
LOGTAIL_KEY= # Use if you're configuring basic logging with logtail
PLAYWRIGHT_MICROSERVICE_URL= # set if you'd like to run a playwright fallback
LLAMAPARSE_API_KEY= #Set if you have a llamaparse key you'd like to use to parse pdfs
```
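Before starting the workers, it can help to confirm that the required variables are actually filled in. The following is a minimal sketch, assuming the key list from the "Required ENVS" section above; `missing_required_envs` is a hypothetical helper name, not something that ships with firecrawl.

```shell
#!/usr/bin/env bash
# Sketch: print every required key that is missing or empty in a dotenv file.
# The key list mirrors the "Required ENVS" section of the template;
# missing_required_envs is a hypothetical helper, not part of firecrawl.
missing_required_envs() {
  local env_file="$1"
  local key
  for key in NUM_WORKERS_PER_QUEUE PORT HOST REDIS_URL USE_DB_AUTHENTICATION; do
    # A key counts as set only when a value directly follows "KEY="
    # (not whitespace and not the start of a comment).
    if ! grep -Eq "^${key}=[^[:space:]#]" "$env_file"; then
      echo "$key"
    fi
  done
}

# Usage, from the repository root:
#   missing_required_envs apps/api/.env
```

An empty output means all five keys have values. The check only looks at the file's text, so it does not verify that redis is actually reachable at `REDIS_URL`.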
You're going to need to open 3 terminals.
### Terminal 1 - setting up redis
Run this command anywhere within your project:
`redis-server`
### Terminal 2 - setting up workers
Now, navigate to the apps/api/ directory and run:
`pnpm run workers`
### Terminal 3 - setting up the main server
To do this, navigate to the apps/api/ directory. If you don't have pnpm already, install it here: https://pnpm.io/installation
Next, run your server with:
`pnpm run start`
### Terminal 3 - sending our first request
Alright, now let's send our first request.
```curl
curl -X GET http://localhost:3002/test
```
This should return the response `Hello, world!`
If you'd like to test the crawl endpoint, you can run this:
```curl
curl -X POST http://localhost:3002/v0/crawl \
-H 'Content-Type: application/json' \
-d '{
"url": "https://mendable.ai"
}'
```
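If you expect to send several crawl requests while developing, the command above can be wrapped in a small function. This is only a sketch: `crawl_payload` and `start_crawl` are hypothetical helper names, and the port falls back to the 3002 value from the .env template.

```shell
#!/usr/bin/env bash
# Sketch: reusable wrapper around the crawl request shown in the guide.
# crawl_payload and start_crawl are hypothetical helpers, not part of firecrawl.

# Build the JSON body for a crawl request. Assumes the URL contains no
# double quotes or backslashes, because no JSON escaping is done here.
crawl_payload() {
  printf '{"url": "%s"}' "$1"
}

# POST the payload to a locally running API server.
start_crawl() {
  curl -s -X POST "http://localhost:${PORT:-3002}/v0/crawl" \
    -H 'Content-Type: application/json' \
    -d "$(crawl_payload "$1")"
}

# Usage (redis, the workers, and the server must already be running):
#   start_crawl https://mendable.ai
```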
## Tests:
The best way to do this is to run the tests with npx. Once again, navigate to the `apps/api` directory and run:
`npx jest`

@@ -1,18 +1,24 @@
-ENV=
# ===== Required ENVS ======
NUM_WORKERS_PER_QUEUE=8
-PORT=
PORT=3002
-HOST=
HOST=0.0.0.0
-SUPABASE_ANON_TOKEN=
REDIS_URL=redis://localhost:6379
-SUPABASE_URL=
## To turn on DB authentication, you need to set up supabase.
USE_DB_AUTHENTICATION=true
# ===== Optional ENVS ======
# Supabase Setup (used to support DB authentication, advanced logging, etc.)
SUPABASE_ANON_TOKEN=
SUPABASE_URL=
SUPABASE_SERVICE_TOKEN=
-REDIS_URL=
-# Optional
# Other Optionals
TEST_API_KEY= # use if you've set up authentication and want to test with a real API key
-SCRAPING_BEE_API_KEY=
SCRAPING_BEE_API_KEY= # Set if you'd like to use ScrapingBee to handle JS blocking
-OPENAI_API_KEY=
OPENAI_API_KEY= # add for LLM dependent features (image alt generation, etc.)
-BULL_AUTH_KEY=
BULL_AUTH_KEY= #
-LOGTAIL_KEY=
LOGTAIL_KEY= # Use if you're configuring basic logging with logtail
-PLAYWRIGHT_MICROSERVICE_URL=
PLAYWRIGHT_MICROSERVICE_URL= # set if you'd like to run a playwright fallback
-LLAMAPARSE_API_KEY=
LLAMAPARSE_API_KEY= # Set if you have a llamaparse key you'd like to use to parse pdfs
-TEST_API_KEY=