Commit

feat: update template
FlorentLvr committed Dec 5, 2023
1 parent 70d6716 commit 5537c6e
Showing 2 changed files with 50 additions and 22 deletions.
61 changes: 48 additions & 13 deletions RegEx/RegEx_Remove_emojis_from_text.ipynb
@@ -52,7 +52,7 @@
"tags": []
},
"source": [
"**Last update:** 2023-12-04 (Created: 2023-12-04)"
"**Last update:** 2023-12-05 (Created: 2023-12-05)"
]
},
{
@@ -63,7 +63,7 @@
"tags": []
},
"source": [
"**Description:** This notebook will show how to remove emojis from a text using RegEx and Python. It is usefull for organizations that need to clean text from emojis."
"**Description:** This notebook will show how to remove emojis from a text using RegEx and Python."
]
},
{
@@ -74,7 +74,9 @@
"tags": []
},
"source": [
"**References:**\n- [Regular Expressions - Python Documentation](https://docs.python.org/3/library/re.html)\n- [Remove Emojis from Text - Stack Overflow](https://stackoverflow.com/questions/33404752/removing-emojis-from-a-string-in-python)"
"**References:**\n",
"- [Regular Expressions - Python Documentation](https://docs.python.org/3/library/re.html)\n",
"- [Remove Emojis from Text - Stack Overflow](https://stackoverflow.com/questions/33404752/removing-emojis-from-a-string-in-python)"
]
},
{
@@ -107,8 +109,10 @@
"papermill": {},
"tags": []
},
"source": "import re",
"outputs": []
"outputs": [],
"source": [
"import re"
]
},
{
"cell_type": "markdown",
@@ -118,7 +122,8 @@
"tags": []
},
"source": [
"### Setup variables\n- `text`: Text containing emojis"
"### Setup variables\n",
"- `text`: Text containing emojis"
]
},
{
@@ -129,8 +134,10 @@
"papermill": {},
"tags": []
},
"source": "text = \"This is a text with emojis \ud83d\ude0a\ud83d\ude0a\ud83d\ude0a\"",
"outputs": []
"outputs": [],
"source": [
"text = \"This is a text with emojis 😊😊😊\""
]
},
{
"cell_type": "markdown",
@@ -173,8 +180,34 @@
"papermill": {},
"tags": []
},
"source": "def remove_emojis(text):\n return re.sub(r\"[^\\w\\s]\", \"\", text)",
"outputs": []
"outputs": [],
"source": [
"def remove_emojis(text):\n",
" # Emoji pattern\n",
" emoji_pattern = re.compile(\"[\"\n",
" u\"\\U0001F600-\\U0001F64F\" # emoticons\n",
" u\"\\U0001F300-\\U0001F5FF\" # symbols & pictographs\n",
" u\"\\U0001F680-\\U0001F6FF\" # transport & map symbols\n",
" u\"\\U0001F1E0-\\U0001F1FF\" # flags (iOS)\n",
" u\"\\U00002500-\\U00002BEF\" # chinese char\n",
" u\"\\U00002702-\\U000027B0\"\n",
" u\"\\U00002702-\\U000027B0\"\n",
" u\"\\U000024C2-\\U0001F251\"\n",
" u\"\\U0001f926-\\U0001f937\"\n",
" u\"\\U00010000-\\U0010ffff\"\n",
" u\"\\u2640-\\u2642\"\n",
" u\"\\u2600-\\u2B55\"\n",
" u\"\\u200d\"\n",
" u\"\\u23cf\"\n",
" u\"\\u23e9\"\n",
" u\"\\u231a\"\n",
" u\"\\ufe0f\" # dingbats\n",
" u\"\\u3030\"\n",
" \"]+\", flags=re.UNICODE)\n",
" # Remove emojis from the text\n",
" text = emoji_pattern.sub(r'', text)\n",
" return text.strip()"
]
},
{
"cell_type": "markdown",
@@ -206,8 +239,10 @@
"papermill": {},
"tags": []
},
"source": "print(remove_emojis(text))",
"outputs": []
"outputs": [],
"source": [
"print(remove_emojis(text))"
]
},
{
"cell_type": "markdown",
@@ -249,4 +284,4 @@
},
"nbformat": 4,
"nbformat_minor": 5
}
}
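
Review note on the new `remove_emojis` pattern (not part of the commit): the character class includes two very broad ranges, `\U000024C2-\U0001F251` and `\U00010000-\U0010ffff`. The first covers most of the Basic Multilingual Plane above U+24C2, so it also strips CJK, Hangul, and other non-emoji characters. Two inline comments also look mislabeled: `\U00002500-\U00002BEF` runs from Box Drawing through Miscellaneous Symbols and Arrows rather than Chinese characters, and `\ufe0f` is Variation Selector-16 rather than a dingbat. The sketch below shows a narrower variant that keeps only explicitly emoji-oriented ranges. The names `EMOJI_PATTERN`, `BROAD_RANGE`, and `remove_emojis_narrow` are hypothetical, and the range list is illustrative, not an exhaustive emoji definition.

```python
import re

# One of the broad ranges from the commit, isolated to show what it matches.
BROAD_RANGE = re.compile("[\U000024C2-\U0001F251]")

# Narrower sketch (an assumption, not the notebook's pattern): only ranges
# that are primarily emoji, so ordinary non-Latin text is left alone.
EMOJI_PATTERN = re.compile(
    "["
    "\U0001F600-\U0001F64F"  # emoticons
    "\U0001F300-\U0001F5FF"  # symbols & pictographs
    "\U0001F680-\U0001F6FF"  # transport & map symbols
    "\U0001F1E0-\U0001F1FF"  # regional indicator symbols (flags)
    "\U0001F900-\U0001F9FF"  # supplemental symbols and pictographs
    "\u2600-\u26FF"          # miscellaneous symbols
    "\u2702-\u27B0"          # dingbats
    "\u200d"                 # zero width joiner
    "\ufe0f"                 # variation selector-16
    "]+"
)


def remove_emojis_narrow(text: str) -> str:
    """Remove characters in EMOJI_PATTERN and trim surrounding whitespace."""
    return EMOJI_PATTERN.sub("", text).strip()


if __name__ == "__main__":
    # The broad range matches ordinary Chinese characters.
    print(BROAD_RANGE.search("你好") is not None)  # True
    # The narrower pattern keeps them and removes the emoji.
    print(remove_emojis_narrow("你好 😊"))  # 你好
    print(remove_emojis_narrow("This is a text with emojis 😊😊😊"))
```

The trade-off is coverage: a narrow list can miss newer emoji blocks, whereas the commit's broad ranges catch nearly everything at the cost of removing legitimate text in many scripts.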
11 changes: 2 additions & 9 deletions template.ipynb
@@ -2,20 +2,13 @@
"cells": [
{
"cell_type": "markdown",
"id": "latin-packing",
"id": "88c104cc-bf08-4242-821b-b3a40908152a",
"metadata": {
"execution": {
"iopub.execute_input": "2021-02-23T14:22:16.610471Z",
"iopub.status.busy": "2021-02-23T14:22:16.610129Z",
"iopub.status.idle": "2021-02-23T14:22:16.627784Z",
"shell.execute_reply": "2021-02-23T14:22:16.626866Z",
"shell.execute_reply.started": "2021-02-23T14:22:16.610384Z"
},
"papermill": {},
"tags": []
},
"source": [
"<img width=\"10%\" alt=\"Naas\" src=\"https://landen.imgix.net/jtci2pxwjczr/assets/5ice39g4.png?w=160\"/>"
"<img width=\"8%\" alt=\"Naas.png\" src=\"https://raw.githubusercontent.com/jupyter-naas/awesome-notebooks/master/.github/assets/logos/Naas.png\" style=\"border-radius: 15%\">"
]
},
{