keras-deep-learning

Build and debug deep learning models with Keras and TensorFlow backend

191 stars

Best use case

keras-deep-learning is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Build and debug deep learning models with Keras and TensorFlow backend

Teams using keras-deep-learning should expect a more consistent output, faster repeated execution, less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$curl -o ~/.claude/skills/keras-deep-learning/SKILL.md --create-dirs "https://raw.githubusercontent.com/wentorai/research-plugins/main/skills/domains/ai-ml/keras-deep-learning/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/keras-deep-learning/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How keras-deep-learning Compares

Feature / Agentkeras-deep-learningStandard Approach
Platform SupportNot specifiedLimited / Varies
Context Awareness High Baseline
Installation ComplexityUnknownN/A

Frequently Asked Questions

What does this skill do?

Build and debug deep learning models with Keras and TensorFlow backend

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# Keras Deep Learning Guide

## Overview

Keras is the high-level deep learning API that ships as part of TensorFlow 2.x and is the recommended interface for building, training, and deploying neural networks. Its Sequential and Functional APIs provide a progressive disclosure of complexity: beginners can stack layers in minutes, while researchers can build arbitrary DAG architectures, custom training loops, and multi-output models with the same framework.

This guide covers practical patterns for academic research with Keras, from image classification and sequence modeling to custom loss functions and experiment reproducibility. The focus is on patterns that appear repeatedly in published work -- data loading pipelines, callback orchestration, hyperparameter search, and model introspection -- rather than toy examples.

Keras is particularly strong in rapid prototyping for research papers. Its integration with TensorBoard, Weights & Biases, and tf.data pipelines makes it straightforward to go from idea to reproducible experiment to publication-quality results.

## Model Architecture Patterns

### Sequential API for Standard Architectures

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# Image classification baseline
model = keras.Sequential([
    layers.Input(shape=(224, 224, 3)),
    layers.Rescaling(1.0 / 255),
    layers.Conv2D(32, 3, activation="relu", padding="same"),
    layers.BatchNormalization(),
    layers.MaxPooling2D(2),
    layers.Conv2D(64, 3, activation="relu", padding="same"),
    layers.BatchNormalization(),
    layers.MaxPooling2D(2),
    layers.Conv2D(128, 3, activation="relu", padding="same"),
    layers.GlobalAveragePooling2D(),
    layers.Dropout(0.3),
    layers.Dense(256, activation="relu"),
    layers.Dense(10, activation="softmax"),
])

model.compile(
    optimizer=keras.optimizers.AdamW(learning_rate=1e-3, weight_decay=1e-4),
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)
```

### Functional API for Multi-Input/Multi-Output Models

```python
# Multi-input model for multimodal research
image_input = keras.Input(shape=(224, 224, 3), name="image")
text_input = keras.Input(shape=(128,), dtype="int32", name="text")

# Image branch
x_img = keras.applications.EfficientNetV2B0(
    include_top=False, weights="imagenet", input_tensor=image_input
).output
x_img = layers.GlobalAveragePooling2D()(x_img)

# Text branch
x_txt = layers.Embedding(10000, 128)(text_input)
x_txt = layers.Bidirectional(layers.LSTM(64))(x_txt)

# Merge
merged = layers.Concatenate()([x_img, x_txt])
merged = layers.Dense(256, activation="relu")(merged)
merged = layers.Dropout(0.4)(merged)
output = layers.Dense(5, activation="softmax", name="classification")(merged)

model = keras.Model(inputs=[image_input, text_input], outputs=output)
```

## Data Pipeline with tf.data

Efficient data loading is critical for GPU utilization in research experiments:

```python
def build_dataset(file_pattern, batch_size=32, training=True):
    """Build a tf.data pipeline with augmentation for research experiments."""
    dataset = tf.data.Dataset.list_files(file_pattern, shuffle=training)

    def parse_image(path):
        img = tf.io.read_file(path)
        img = tf.image.decode_jpeg(img, channels=3)
        img = tf.image.resize(img, [256, 256])
        label = tf.strings.split(path, os.sep)[-2]
        return img, label

    dataset = dataset.map(parse_image, num_parallel_calls=tf.data.AUTOTUNE)

    if training:
        dataset = dataset.shuffle(1000)
        dataset = dataset.map(
            lambda x, y: (tf.image.random_flip_left_right(x), y),
            num_parallel_calls=tf.data.AUTOTUNE,
        )

    dataset = dataset.batch(batch_size)
    dataset = dataset.prefetch(tf.data.AUTOTUNE)
    return dataset
```

## Training and Callback Orchestration

### Reproducible Training Setup

```python
import os
import random
import numpy as np

def set_seed(seed=42):
    """Ensure reproducibility across runs for paper results."""
    os.environ["PYTHONHASHSEED"] = str(seed)
    random.seed(seed)
    np.random.seed(seed)
    tf.random.set_seed(seed)

set_seed(42)

callbacks = [
    keras.callbacks.ModelCheckpoint(
        "best_model.keras", monitor="val_loss", save_best_only=True
    ),
    keras.callbacks.EarlyStopping(
        monitor="val_loss", patience=10, restore_best_weights=True
    ),
    keras.callbacks.ReduceLROnPlateau(
        monitor="val_loss", factor=0.5, patience=5, min_lr=1e-6
    ),
    keras.callbacks.TensorBoard(log_dir="./logs", histogram_freq=1),
    keras.callbacks.CSVLogger("training_log.csv"),
]

history = model.fit(
    train_dataset,
    validation_data=val_dataset,
    epochs=100,
    callbacks=callbacks,
)
```

### Custom Training Loop for Research

```python
@tf.function
def train_step(model, optimizer, x, y, loss_fn):
    with tf.GradientTape() as tape:
        predictions = model(x, training=True)
        loss = loss_fn(y, predictions)
    gradients = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(gradients, model.trainable_variables))
    return loss

# Custom metric tracking
train_loss = keras.metrics.Mean(name="train_loss")
for epoch in range(num_epochs):
    train_loss.reset_state()
    for x_batch, y_batch in train_dataset:
        loss = train_step(model, optimizer, x_batch, y_batch, loss_fn)
        train_loss.update_state(loss)
    print(f"Epoch {epoch+1}, Loss: {train_loss.result():.4f}")
```

## Debugging and Common Pitfalls

| Issue | Symptom | Solution |
|-------|---------|----------|
| Exploding gradients | Loss becomes NaN | Add gradient clipping, reduce learning rate |
| Overfitting | Val loss diverges from train loss | Add Dropout, data augmentation, weight decay |
| Underfitting | Both losses plateau high | Increase model capacity, reduce regularization |
| Slow training | Low GPU utilization | Use tf.data with prefetch, increase batch size |
| Memory errors | OOM on GPU | Reduce batch size, use mixed precision |
| Non-deterministic results | Different results per run | Call `set_seed()`, set `TF_DETERMINISTIC_OPS=1` |

### Mixed Precision Training

```python
# Enable mixed precision for 2x speedup on modern GPUs
keras.mixed_precision.set_global_policy("mixed_float16")

# Ensure the output layer uses float32 for numerical stability
output = layers.Dense(10, activation="softmax", dtype="float32")(x)
```

## Best Practices for Research

- **Version pin everything.** Record `tensorflow`, `keras`, `numpy`, and `cuda` versions in your paper appendix.
- **Use `keras.utils.set_random_seed(42)`** for full determinism (TF 2.12+).
- **Save models in `.keras` format** (not HDF5) for forward compatibility.
- **Profile with TensorBoard** to identify data pipeline bottlenecks before scaling up.
- **Use `tf.debugging.enable_check_numerics()`** during development to catch NaN/Inf early.
- **Export with `tf.saved_model`** for deployment; export ONNX for cross-framework comparison.

## References

- [Deep Learning with Python, 2nd Edition](https://www.manning.com/books/deep-learning-with-python-second-edition) -- Francois Chollet (Keras creator)
- [Keras documentation](https://keras.io/) -- Official API reference and guides
- [TensorFlow tutorials](https://www.tensorflow.org/tutorials) -- End-to-end examples
- [fchollet/deep-learning-with-python-notebooks](https://github.com/fchollet/deep-learning-with-python-notebooks) -- Code companion to the book
- [Keras examples gallery](https://keras.io/examples/) -- 100+ community-contributed examples

Related Skills

tongyi-deep-research-guide

191
from wentorai/research-plugins

Open-source deep research agent by Alibaba for scholarly research

local-deep-research-guide

191
from wentorai/research-plugins

Deep research agent searching 10+ sources with local or cloud LLMs

deep-searcher-guide

191
from wentorai/research-plugins

Open deep research alternative for private data with vector search

auto-deep-research-guide

191
from wentorai/research-plugins

Automated deep research tool for thorough topic investigation

deep-research-skills

191
from wentorai/research-plugins

13 deep research & systematic reviews skills. Trigger: systematic reviews, multi-source synthesis, comprehensive literature surveys. Design: multi-step research protocols with quality assessment and evidence grading.

deep-literature-search

191
from wentorai/research-plugins

Multi-source exhaustive literature search across academic databases

learning-science-guide

191
from wentorai/research-plugins

Evidence-based learning science principles for educational research and practice

reinforcement-learning-guide

191
from wentorai/research-plugins

Reinforcement learning fundamentals, algorithms, and research

graph-learning-papers-guide

191
from wentorai/research-plugins

Conference papers on graph neural networks and graph learning

deep-learning-papers-guide

191
from wentorai/research-plugins

Annotated deep learning paper implementations with code walkthroughs

thuthesis-guide

191
from wentorai/research-plugins

Write Tsinghua University theses using the ThuThesis LaTeX template

thesis-writing-guide

191
from wentorai/research-plugins

Templates, formatting rules, and strategies for thesis and dissertation writing