DSA Fundamentals: Arrays & Strings - From Theory to LeetCode Practice

Jay Kye
12 min read

Master arrays and strings data structures through theory and practical LeetCode problem solving. Covering static vs dynamic arrays, string immutability, time complexity analysis, and essential algorithms.

DSAArraysStringsLeetCodeAlgorithmsData StructuresProblem Solving

Data Structures and Algorithms form the foundation of efficient programming. Today, we'll dive deep into Arrays and Strings - two fundamental data structures that appear in virtually every coding interview and real-world application.

This comprehensive guide combines theoretical understanding with practical problem-solving, featuring solutions to essential LeetCode problems that demonstrate core concepts.

Understanding Arrays: Static vs Dynamic

Array Fundamentals

Arrays are mutable (changeable) collections that store elements in contiguous memory locations, providing efficient random access.

# Array mutability example
A = [1, 2, 3, 4, 5]
A[3] = 1  # Modify element at index 3
print(A)  # Output: [1, 2, 3, 1, 5]

Static Arrays vs Dynamic Arrays

Static Arrays:

  • Fixed size at creation time
  • To change size: copy all elements → create new array → paste elements
  • Memory allocated at compile time

Dynamic Arrays:

  • Size can grow/shrink during runtime
  • Programming language automatically handles expansion
  • Memory allocated at runtime (Python lists, JavaScript arrays)

Array Operations Time Complexity

OperationTime ComplexityNotes
Random AccessO(1)Direct index access
Search (in operator)O(n)Linear scan required
Append to endO(1)**Amortized average
Pop from endO(1)Direct removal
Insert (not at end)O(n)Shift elements required
Delete (not at end)O(n)Shift elements required
Modify elementO(1)Direct access and change

Understanding Strings: Immutability Matters

Strings are immutable - once created, they cannot be changed. Any "modification" creates a new string object.

String vs Array Operations Comparison

OperationArray/ListString (Immutable)
Appending to endO(1)* AmortizedO(n)
Popping from endO(1)O(n)
Insertion (not from end)O(n)O(n)
Deletion (not from end)O(n)O(n)
Modifying an elementO(1)O(n)
Random AccessO(1)O(1)
SearchingO(n)O(n)

Key Insight: String immutability means every modification operation creates a new string, leading to O(n) complexity even for simple operations.

Essential LeetCode Problems & Solutions

Let's explore fundamental array and string problems that demonstrate core algorithmic patterns.

1. Valid Sudoku (LeetCode 36)

Problem: Determine if a 9x9 Sudoku board is valid.

Approach: Use hash sets to track seen numbers in rows, columns, and 3x3 boxes.

def isValidSudoku(board):
    # Create sets for rows, columns, and 3x3 boxes
    rows = [set() for _ in range(9)]
    cols = [set() for _ in range(9)]
    boxes = [set() for _ in range(9)]
    
    for row in range(9):
        for col in range(9):
            current_val = board[row][col]
            
            # Skip empty cells
            if current_val == '.':
                continue
            
            # Calculate box index: (row//3) * 3 + (col//3)
            box_index = (row // 3) * 3 + (col // 3)
            
            # Check if number already exists
            if (current_val in rows[row] or 
                current_val in cols[col] or 
                current_val in boxes[box_index]):
                return False
            
            # Add to respective sets
            rows[row].add(current_val)
            cols[col].add(current_val)
            boxes[box_index].add(current_val)
    
    return True

Time Complexity: O(1) - Fixed 9x9 grid = 81 operations Space Complexity: O(1) - Fixed number of sets

Key Concepts:

  • Hash Set Usage: Efficient O(1) lookup for duplicates
  • Box Index Calculation: (row//3) * 3 + (col//3) maps 2D coordinates to box number
  • Early Termination: Return False immediately when duplicate found

2. Longest Consecutive Sequence (LeetCode 128)

Problem: Find the length of the longest consecutive elements sequence in O(n) time.

Approach: Use hash set to identify sequence starts and extend sequences.

def longestConsecutive(nums):
    if not nums:
        return 0
    
    # Convert to set for O(1) lookup and remove duplicates
    num_set = set(nums)
    longest = 0
    
    for num in num_set:
        # Check if this can be a sequence start
        if num - 1 not in num_set:
            current_num = num
            current_length = 1
            
            # Extend the sequence
            while current_num + 1 in num_set:
                current_num += 1
                current_length += 1
            
            # Update longest sequence found
            longest = max(longest, current_length)
    
    return longest

Time Complexity: O(n) - Each number visited at most twice Space Complexity: O(n) - Hash set storage

Key Concepts:

  • Sequence Start Identification: Only start counting when num - 1 not in set
  • Hash Set Optimization: O(1) lookup prevents nested loops
  • Duplicate Handling: Set automatically removes duplicates

3. Product of Array Except Self (LeetCode 238)

Problem: Return array where each element is the product of all elements except itself (no division allowed).

Approach: Two-pass algorithm - left products, then right products.

def productExceptSelf(nums):
    n = len(nums)
    result = [1] * n
    
    # First pass: accumulate products to the left
    for i in range(1, n):
        result[i] = result[i-1] * nums[i-1]
    
    # Second pass: multiply by products to the right
    right_product = 1
    for i in range(n-1, -1, -1):
        result[i] *= right_product
        right_product *= nums[i]
    
    return result

# Example walkthrough:
# nums = [1, 2, 3, 4]
# After first pass:  [1, 1, 2, 6]  (left products)
# After second pass: [24, 12, 8, 6] (left * right products)

Time Complexity: O(n) - Two passes through array Space Complexity: O(1) - Only using result array (required output)

Key Concepts:

  • Two-Pass Strategy: Separate left and right product calculations
  • Space Optimization: Use result array to store intermediate values
  • No Division Constraint: Handles edge cases like zeros naturally

4. Top K Frequent Elements (LeetCode 347)

Problem: Return k most frequent elements from an array.

Approach 1: Counter + Sorting

from collections import Counter

def topKFrequent(nums, k):
    # Count frequencies
    counter = Counter(nums)
    
    # Sort by frequency (descending)
    sorted_items = sorted(counter.items(), key=lambda x: x[1], reverse=True)
    
    # Extract first k elements
    result = []
    for i in range(k):
        result.append(sorted_items[i][0])
    
    return result

Time Complexity: O(n log n) - Sorting dominates Space Complexity: O(n) - Counter storage

Approach 2: Counter + Bucket Sort (Optimal)

from collections import Counter

def topKFrequent(nums, k):
    counter = Counter(nums)
    n = len(nums)
    
    # Create buckets for each possible frequency (0 to n)
    buckets = [[] for _ in range(n + 1)]
    
    # Place numbers in buckets based on frequency
    for num, freq in counter.items():
        buckets[freq].append(num)
    
    # Collect k most frequent elements (iterate backwards)
    result = []
    for freq in range(n, -1, -1):
        for num in buckets[freq]:
            result.append(num)
            if len(result) == k:
                return result
    
    return result

Time Complexity: O(n) - Linear bucket sort Space Complexity: O(n) - Bucket array

Key Concepts:

  • Bucket Sort Application: Frequency as bucket index
  • Trade-off Analysis: Sorting vs bucket sort based on constraints
  • Early Termination: Stop when k elements found

5. Group Anagrams (LeetCode 49)

Problem: Group strings that are anagrams of each other.

Approach 1: Sorting as Key

from collections import defaultdict

def groupAnagrams(strs):
    groups = defaultdict(list)
    
    for s in strs:
        # Use sorted string as key
        key = ''.join(sorted(s))
        groups[key].append(s)
    
    return list(groups.values())

Time Complexity: O(n * m log m) where n = number of strings, m = average string length Space Complexity: O(n * m) - Storage for groups

Approach 2: Character Count as Key (Optimal)

from collections import defaultdict

def groupAnagrams(strs):
    groups = defaultdict(list)
    
    for s in strs:
        # Create character count array
        char_count = [0] * 26
        for char in s:
            char_count[ord(char) - ord('a')] += 1
        
        # Use tuple of counts as key (arrays aren't hashable)
        key = tuple(char_count)
        groups[key].append(s)
    
    return list(groups.values())

Time Complexity: O(n * m) - Linear in string length Space Complexity: O(n * m) - Storage for groups

Key Concepts:

  • Anagram Property: Same character frequencies
  • Key Generation: Sorted string vs character count
  • Hashable Keys: Tuple conversion for dictionary keys

6. Two Sum (LeetCode 1)

Problem: Find two numbers that add up to target.

Approach: Hash map for complement lookup.

def twoSum(nums, target):
    seen = {}  # value -> index mapping
    
    for i, num in enumerate(nums):
        complement = target - num
        
        if complement in seen:
            return [seen[complement], i]
        
        seen[num] = i
    
    return []  # No solution found

Time Complexity: O(n) - Single pass Space Complexity: O(n) - Hash map storage

Key Concepts:

  • Complement Strategy: target - current = needed value
  • Hash Map Optimization: O(1) lookup vs O(n) nested loops
  • Index Tracking: Store positions for result

7. Valid Anagram (LeetCode 242)

Problem: Determine if two strings are anagrams.

Approach 1: Sorting

def isAnagram(s, t):
    return sorted(s) == sorted(t)

Time Complexity: O(n log n) - Sorting Space Complexity: O(1) - Ignoring sort space

Approach 2: Character Frequency (Optimal)

from collections import Counter

def isAnagram(s, t):
    return Counter(s) == Counter(t)

# Or manual counting:
def isAnagram(s, t):
    if len(s) != len(t):
        return False
    
    char_count = {}
    
    # Count characters in s
    for char in s:
        char_count[char] = char_count.get(char, 0) + 1
    
    # Subtract characters in t
    for char in t:
        if char not in char_count:
            return False
        char_count[char] -= 1
        if char_count[char] == 0:
            del char_count[char]
    
    return len(char_count) == 0

Time Complexity: O(n) - Linear scan Space Complexity: O(k) where k = unique characters

8. Contains Duplicate (LeetCode 217)

Problem: Check if array contains any duplicates.

Approach 1: Hash Set

def containsDuplicate(nums):
    seen = set()
    
    for num in nums:
        if num in seen:
            return True
        seen.add(num)
    
    return False

Approach 2: Set Length Comparison

def containsDuplicate(nums):
    return len(set(nums)) != len(nums)

Time Complexity: O(n) - Single pass or set creation Space Complexity: O(n) - Hash set storage

Key Concepts:

  • Early Termination: Return immediately when duplicate found
  • Set Properties: Automatic duplicate removal
  • Length Comparison: Elegant one-liner solution

Core Algorithm Patterns from Today's Problems

1. Hash-Based Solutions (Primary Pattern)

  • When to use: Fast lookups, duplicate detection, complement finding
  • Examples: Two Sum, Valid Sudoku, Contains Duplicate, Longest Consecutive Sequence
  • Key insight: Trade O(n) space for O(1) lookup time
  • Implementation: Hash Sets for membership, Hash Maps for key-value pairs

2. Frequency Analysis (Primary Pattern)

  • When to use: Anagrams, character counting, top-k problems
  • Examples: Group Anagrams, Valid Anagram, Top K Frequent Elements
  • Tools: Counter, hash maps, bucket sort
  • Optimization: Linear time frequency-based solutions instead of sorting

3. Multi-Pass Array Processing

  • When to use: Complex calculations requiring multiple scans
  • Example: Product of Array Except Self (left pass + right pass)
  • Benefit: Maintain O(n) time while avoiding division or complex logic
  • Pattern: Accumulate information in one direction, then process in reverse

4. Set Operations for Uniqueness

  • When to use: Removing duplicates, finding unique elements
  • Examples: Longest Consecutive Sequence (duplicate removal), Contains Duplicate
  • Advantage: Automatic duplicate handling with O(1) operations
  • Common trick: Compare len(set(array)) vs len(array) for duplicate detection

Performance Optimization Tips

1. Choose Right Data Structure

# Slow: List membership testing
if item in my_list:  # O(n)

# Fast: Set membership testing  
if item in my_set:   # O(1)

2. Minimize String Operations

# Slow: String concatenation in loop
result = ""
for char in chars:
    result += char  # O(n) each iteration

# Fast: Join operation
result = ''.join(chars)  # O(n) total

3. Early Termination

# Stop as soon as condition is met
for item in items:
    if condition(item):
        return True  # Don't continue unnecessary iterations
return False

4. Space-Time Trade-offs

# Time-optimized (more space)
seen = set(nums)  # O(n) space
return len(seen) != len(nums)  # O(n) time

# Space-optimized (more time)  
nums.sort()  # O(1) space (in-place)
for i in range(1, len(nums)):  # O(n log n) time
    if nums[i] == nums[i-1]:
        return True

Common Pitfalls and Solutions

1. Index Out of Bounds

# Dangerous
for i in range(len(arr)):
    if arr[i+1] > arr[i]:  # Error when i = len(arr)-1

# Safe
for i in range(len(arr) - 1):
    if arr[i+1] > arr[i]:

2. Modifying List While Iterating

# Dangerous
for item in my_list:
    if condition(item):
        my_list.remove(item)  # Changes list size during iteration

# Safe
my_list = [item for item in my_list if not condition(item)]

3. String Immutability Oversight

# Inefficient for large strings
def reverse_string(s):
    result = ""
    for char in reversed(s):
        result += char  # O(n²) due to string immutability
    return result

# Efficient
def reverse_string(s):
    return s[::-1]  # O(n) built-in operation

Conclusion

Arrays and strings form the foundation of algorithmic problem-solving. Key takeaways:

Core Concepts Mastered:

  • Array mutability vs string immutability and their performance implications
  • Static vs dynamic arrays and when to use each
  • Time complexity analysis for common operations
  • Hash-based optimizations for O(1) lookups

Essential Patterns Mastered Today:

  • Hash-based lookups (Two Sum, Valid Sudoku, Contains Duplicate)
  • Frequency analysis (anagrams, Top K problems)
  • Multi-pass processing (Product Except Self)
  • Set operations (duplicate detection, uniqueness)

Problem-Solving Strategies:

  • Choose appropriate data structures based on operation requirements
  • Consider space-time trade-offs for optimization
  • Use early termination when possible
  • Leverage built-in functions for common operations

Next Steps:

Building on this foundation, future topics will cover:

  • Linked Lists and Pointers
  • Stack and Queue Applications
  • Tree and Graph Traversals
  • Dynamic Programming Patterns

The problems covered here represent fundamental patterns that appear across technical interviews and real-world applications. Master these concepts, and you'll have a solid foundation for tackling more complex algorithmic challenges.


References:

This comprehensive guide combines theoretical understanding with practical problem-solving, providing a solid foundation for data structures and algorithms mastery.